(III) Introduction


Grant No. AFOSR-76-3024 was initiated on March 15, 1976.
The  overall  goal  of  research  sponsored  under  this  Grant  can  be 
summarized  in  two  different  objectives: 

(1) To demonstrate the feasibility of optical computations
for the implementation of image bandwidth compression;

(2) Upon demonstration of feasibility, to carry out the
research necessary to advance optical computations for
image bandwidth compression to approximately the same
state of sophistication as currently associated with
digital computations for image bandwidth compression.

Achieving  these  two  objectives  required  the  undertaking  of 
several  specific  steps  in  the  course  of  the  research  sponsored 
under  the  Grant.  The  steps  were: 

(1) Survey optical computation systems and schemes
to establish the repertoire of functions which
could be carried out optically, e.g., convolutions
(both coherent and incoherent), sampling,
differentiation, integration, etc.;

(2) Establish what computational processes or functions
were useful in the development of bandwidth
compression schemes;

(3) Determine architectural configurations for image
bandwidth compression, that is, try to find
arrangements of optical processor components to
achieve compression calculations;

(4) Demonstrate, by means of simulations, that a
proposed optical processor architecture would
achieve some level of bandwidth compression;

(5) Seek improved architectures or new processor
functions to increase the compression performance
of candidate systems.

These five steps were repeated continually for a variety of proposed
optical schemes for image bandwidth compression algorithms.

In this report, the final technical report for Grant AFOSR-76-3024,
we present the results of the efforts which were sponsored by the
Grant.

Step (4) in the above list is an important facet of our research,
and one which distinguishes it from other research in both optical
processing and image bandwidth compression. Every candidate
architecture for bandwidth compression by optical processor was
simulated using a multi-purpose, high-performance digital image
processing facility at the University of Arizona (see Appendix A).
It is important to understand the motivation for using digital
image processing in the development of optical processing schemes.

Optical processing research and development can be divided
into two broad categories: the development of physical materials,
devices, or systems which realize specific mathematical functions
or computational processes; and the development of systems
architectures which treat the individual mathematical functions or
computational processes as components, and then assemble the
components into a high-level system which achieves some complex
processing objective.

Much of the research in optical processing falls into the
first of these two categories: development of materials, devices,
or systems which realize specific mathematical functions or
processes. The concentration of effort in this area is quite
important, for many of the great advantages of optical processing will
not be realized without, for example, better materials to serve
as spatial light modulators, or higher bandwidth in the schemes
by which a modulator is accessed and addressed. The research in
this area requires painstaking and careful control of experimental
facilities.

In the research sponsored under Grant AFOSR-76-3024, we have
concentrated on the second category: the development of systems
architectures. Because the devices and materials for optical
processing are undergoing continual ferment, any attempt to physically
realize an optical processor for bandwidth compression could be
dependent upon the component devices chosen, e.g., the modulator. In
our research the use of digital image processing to simulate
optical processor bandwidth compression systems is a direct
consequence of our intent to concentrate upon systems-level concepts
and avoid the turmoil of specific material or device implementation
technologies. Digital image processing concentrates upon
functional capabilities and their implementation, and is much more
flexible. It is important to realize, however, that the use of
digital image processing has been tied to optical realizability;
that is, no digital image processing functions were used in a
simulation unless they could be realistically implemented in some
properly configured optical device.

It is worthwhile noting that our course of investigation has
gone from systems using all optical componentry to systems using
a mix of optical and digital componentry. This is parallel to
the path followed by other workers in optical processing. As the
price, performance, and ease of interfacing of digital components
and sensors have improved, more advantages to including digital
components in optical systems have emerged. Hybrid optical/digital
systems are being considered as the most effective solution to a
number of signal-processing problems, and our work under Grant No.
AFOSR-76-3024 is no different. The more complex systems for image
bandwidth compression which have been studied in the latter phases
of Grant AFOSR-76-3024 have explicitly involved the employment of
digital componentry functions.
(IV) Summary of Important Results

We  will  summarize  our  important  results  first  in  terms  of 
the  general  objectives  set  forth  in  the  beginning  of  Section  III, 
the  Introduction: 

(1)  We  have  demonstrated  the  feasibility  of  a  number 
of  optical  processing  system  architectures  which 
can  be  employed  for  image  bandwidth  compression; 

(2) Our results appear to justify the assertion that
optical processing for image bandwidth compression
can achieve the same sophistication as digital
processing for image bandwidth compression.

We will summarize below the specific systems which led us to the
result stated under item (1). First, however, we consider the
result asserted in item (2), since this provides a proper point of
entry for describing the current state of digital processing for
image bandwidth compression and the relationship of our research
to it.

Image bandwidth compression schemes can be classified in two
different ways. One major level of classification is in the
domain where the compression scheme operates. There are two major
divisions to this classification: space domain compression, where
computations take place in the original image space of the input
data; and transform domain compression, where computations are
used to generate a transform of the original image, and subsequent
compression computations take place in this transform domain, e.g.,
the Fourier domain. For either spatial or transform, the purpose
of the compression computations is to remove the redundancy which
exists in the imagery, thereby producing an image which is
"decorrelated" in some sense.

It is the dimension wherein the decorrelation takes place
that defines the other major level of classification for bandwidth
compression schemes. That is, the characteristics of the
imagery are used to determine what decorrelation computations are
carried out. For monochrome, single-frame imagery only spatial
redundancy can be eliminated. For polychrome, single-frame images
redundancy in both spatial and wavelength (or spectral) content
can be eliminated. For multi-frame imagery, with each frame
sequenced in time, any temporal redundancy can be eliminated.
Combinations of any one or two of the redundancy measures (spatial,
spectral, or temporal) lead to different data compression schemes.

Digital image bandwidth compression schemes have been demonstrated
for virtually all combinations of spatial or transform domain
processing with spatial, spectral, and temporal redundancy
reduction. It is this wealth of results which is the mainstay of the
literature in the compression of image bandwidth by digital computations.

A final sophistication in digital image bandwidth compression
techniques is whether a given technique is adaptive or nonadaptive.
A nonadaptive compression scheme processes all portions of an image
in the same way, i.e., the operating parameters of the compression
algorithm are the same for all components of the image. Conversely,
an adaptive scheme recognizes that the level of redundancy of an
image is not constant in space, spectral content, or time; for most
efficient operation, therefore, some of the critical operating
parameters of the bandwidth compression algorithm are adjusted as
a function of the behavior of the imagery.

*See, for example, the bandwidth compression chapters of Pratt,
Digital Image Processing, Wiley, New York, 1978.

Thus, digital image bandwidth compression processes can be
characterized as: spatial domain vs. transform domain; spatial
redundancy reduction vs. spectral redundancy reduction vs. temporal
redundancy reduction; adaptive processing vs. nonadaptive
processing. The overall complexity is graphically displayed in
Figure 1. To add further to the complexity of digital image
bandwidth compression schemes, recall that any of the schemes within
a box in Figure 1 can be combined with others. Thus, a scheme
could be developed, for example, by combining nonadaptive
spatial redundancy reduction processing in the spatial domain,
with adaptive transform domain processing of the color or spectral
components, and nonadaptive spatial domain processing to reduce
temporal redundancy.

The point of Figure 1 is that it frames the challenge posed by
the second research objective stated in the Introduction
(Section III), i.e., to make the level of optical processing for
image bandwidth compression equal to that of digital processing
for image bandwidth compression.

The research conducted under Grant AFOSR-76-3024 has concentrated
only upon the left half of Figure 1, spatial domain processing.
This is not because optics are not suited to transform
domain processing. On the contrary, the Fourier transform
capabilities of coherent optical systems are well suited to the
computations for data compression which have been proven in the
context of digital compression research. In fact, digital
compression research using the transform domain is so extensive that
it would be straightforward to develop a coherent system for carrying
out the transform domain compression process. Such a system would
have one severe problem: the requirement to employ coherent
(holographic) detection in the Fourier plane in order to obtain
both magnitude and phase components for the compression coding
step. However, it could be done. Since this did not represent
a challenge in the context of new and/or unique system-level
architectures for optical processing in image bandwidth compression,
such research was not undertaken. It is in spatial domain
processing, for spatial, spectral, and temporal redundancy, where
our research was concentrated.

In the following we summarize the different optical processing
schemes which were investigated during the course of research
sponsored under Grant No. AFOSR-76-3024. The details of the
actual processing schemes and the simulations are contained in the
series of appendices affixed to this report. The appendix which
documents a scheme summarized below should be consulted for
specifics.

(IV.1) Interpolated DPCM

The simplest digital data compression scheme for single-frame
monochrome imagery is Differential Pulse Code Modulation (DPCM).
The first research undertaken for the purposes of Grant
AFOSR-76-3024 was to construct a compression system which was the
optical analogy of digital DPCM. In the process new insight was
gained into digital DPCM as well.


The basis of digital DPCM is prediction. As the imagery is
scanned (left-to-right, top-to-bottom), the pixels in a given
causal neighborhood (called the prediction neighborhood) are
combined to form an estimate of the pixel which is next to be
encountered in the scan. The difference between the estimate
and the actual pixel value is then quantized and coded. DPCM
works because, except in the neighborhood of abrupt slopes, an
estimate will be accurate. Hence, the differential will be small
and can be coded with a small number of bits.
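
The prediction loop described above can be illustrated with a minimal sketch for a single scan line. The previous-pixel predictor, the uniform quantizer, the step size, and the function names are assumptions made for illustration; they are not taken from the report.

```python
def dpcm_encode(row, step=4):
    """Previous-pixel DPCM on one scan line (illustrative assumptions only).

    Returns one quantizer index per pixel.
    """
    codes = []
    prediction = 0  # assumed starting prediction before the first pixel
    for pixel in row:
        diff = pixel - prediction          # differential between actual and estimate
        q = round(diff / step)             # assumed uniform quantizer
        codes.append(q)
        prediction = prediction + q * step # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the scan line by accumulating the dequantized differentials."""
    recon = []
    prediction = 0
    for q in codes:
        prediction = prediction + q * step
        recon.append(prediction)
    return recon


row = [100, 102, 101, 105, 140, 142]
codes = dpcm_encode(row)
recon = dpcm_decode(codes)
```

In smooth regions the indices stay near zero, so they can be coded with few bits; only the first sample and the abrupt jump to 140 produce large indices.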

The development of an optical analogy required releasing the
causality constraint associated with an imagery scan, since optical
systems are noncausal. In doing this, we replaced the pixel
estimate with a neighborhood smoothing, and computation of a
differential between the smoothed and actual pixel values. The
differential proved to be small and was coded as in standard DPCM.
Optical system configurations which would realize this new scheme,
called interpolated DPCM, were configured on the basis of
incoherent processors and video electronics.
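
As a rough one-dimensional sketch of the smoothing-and-differential idea, the following replaces the causal prediction with a symmetric (noncausal) neighborhood average. The box window, its radius, the quantizer step, and all names are assumptions for illustration; any subsampling and interpolation of the smoothed channel is omitted, and Appendix B should be consulted for the actual IDPCM process.

```python
def box_smooth(signal, radius=1):
    """Noncausal moving average; the window is truncated at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def idpcm_encode(signal, radius=1, step=2):
    """Return the smoothed signal and the quantized differentials."""
    smooth = box_smooth(signal, radius)
    codes = [round((s - m) / step) for s, m in zip(signal, smooth)]
    return smooth, codes


def idpcm_decode(smooth, codes, step=2):
    """Add the dequantized differentials back onto the smoothed signal."""
    return [m + q * step for m, q in zip(smooth, codes)]


signal = [10, 10, 12, 11, 10, 50, 52, 51]
smooth, codes = idpcm_encode(signal)
recon = idpcm_decode(smooth, codes)
```

Because the estimate uses pixels on both sides of the current one, no scan order is implied, which is what makes the computation a candidate for a parallel optical realization.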

Appendix B is a paper which describes the IDPCM process in
some detail.

(IV.2) Incoherent Feedback Video Processor

The noncausal optical system for IDPCM was then examined from
the viewpoint of a generalization. It can be shown* that DPCM can
be formulated as a temporal feedback scheme. A generalization of
this scheme to imagery would require noncausal convolutions and
image-plane to image-plane feedback. An architecture to do this
can be postulated as in Figure 2, where we show the general
structure of causal DPCM and an architecture for noncausal
image-plane to image-plane differences in an analogous fashion.

*See, for example, the book by Pratt, Digital Image Processing,
Wiley, New York, 1978.

In theory the process in the lower half of Figure 2 can be
realized with coherent processing, with beam splitters and phase
adjustments serving as sums and differences; the quantization
process could be realized by any of the nonlinear methods studied
by Sawchuk*. However, the practical implementation of this
process represents great difficulty, chiefly due to phase coherence.
In even a miniaturized version of the scheme with integrated
optics, the distance around the feedback paths will be very great
compared with the wavelengths of light. The phase of signals at
the difference and sum planes cannot be controlled, as a result,
without resorting to interferometric precision.

Despite the difficulty of the feedback architecture, it was
promising to investigate whether the phase problems could be
solved. One direct choice is to employ video systems. By imaging
onto video sensors and using the video signals for image-plane
to image-plane operations, the phase problem related to distance
around the loop is solved, since video systems can operate
at wavelengths which are long compared to the feedback path. A
new problem is introduced, however, by the temporal scanning in
a video system. The problem of phase stability over the spatial
distance around the loop is exchanged for a problem of temporal
stability caused by the time delay to scan one frame.

*Sawchuk and Dashiell, SPIE Proceedings on Image Processing, Vol.
74, p. 93, 1976.


Figure 2. (a) Causal DPCM; (b) noncausal image-plane to image-plane
feedback architecture.
We chose to investigate the problems of a sampled-data
implementation of an incoherent optical video feedback processor. We
also chose to simulate the operation of such a system. Our
analysis is given in Appendix C. Our simulations showed the concept
was feasible, but fraught with sufficient difficulties to make it
unattractive. This line of research was discontinued.

(IV.3) Optical Image Bandwidth Compression in the Eye

The work which we conducted in the development and evaluation
of IDPCM was further extended to an interesting insight into the
human visual system. Based on a number of facts about the
physiology of the human visual system, it is possible to recognize that
the structure of the IDPCM bandwidth compression processor is quite
similar to an equivalent structure in the eye. Furthermore, both
processes are suited for operation in incoherent light, which is
an advantage for IDPCM and an absolute necessity for the human eye.

Once this analogy was recognized, research was undertaken to
examine the extent to which the processing of information by the
human visual system could be interpreted in a bandwidth compression
context. Our results in this research are summarized in Appendix D,
a paper discussing a two-channel model of human vision and its
direct relationship to the IDPCM compression technique presented in
Appendix B.

(IV.4) Spline Interpolation for Image Bandwidth Compression

With the success of the IDPCM scheme, which is nonadaptive
and operates with incoherent light, the next step in complexity
was to introduce some degree of adaptive processing into the
optical processors for bandwidth compression. Our research in
this area was motivated by recent work in the adaptive compression
of imagery by B-spline functions. The B-spline functions
have a local property and can be constructed from repeated
convolutions (which an optical processor can implement directly).
Likewise, an analysis of the B-spline approximation problem shows
that least-squares fitting of data can be derived by simple
integration and differentiation processes, which an optical processor
can also implement. Since all of these processes require coherent
operation, however, the resulting processor must operate in
coherent light and an incoherent-to-coherent conversion device (such
as Hughes liquid crystal) becomes necessary.
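
The statement that B-splines can be built from repeated convolutions can be checked numerically with a short sketch. The discrete box width and the function names are assumptions; the sampled kernels only approximate the continuous B-splines, and nothing here models the optical processor of Appendix E.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def bspline_kernel(order, width=4):
    """Discrete B-spline-like kernel: a unit-area box convolved with itself.

    order 0 is the box, order 1 a triangle (linear B-spline),
    order 3 a bell-shaped approximation to the cubic B-spline.
    """
    box = [1.0 / width] * width
    kernel = box
    for _ in range(order):
        kernel = convolve(kernel, box)
    return kernel


triangle = bspline_kernel(1, width=2)   # [0.25, 0.5, 0.25]
cubic = bspline_kernel(3, width=4)      # 13 samples, symmetric, unit sum
```

Each kernel is nonzero only over a finite support, which is the local property referred to above.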

Appendix E contains the detailed theory of an optical processor
scheme which can adapt its compression behavior to the
specific properties of the image in a localized region. It is a
complicated processor, unfortunately. Although our simulations
indicate that such a processor would have high performance in
image bandwidth compression, we believe a viable adaptive
processor must be simpler. This line of research was discontinued
as a result.

(IV.5) Adaptive Bandwidth Compression Scheme

Since the spline processor achieved adaptive compression
behavior only at the expense of great complexity, a search for a
much simpler scheme was undertaken. This led to the development
of a scheme which utilized known behavior of the visual system:
sensitivity to edges. Many judgements about image quality, and
many tasks in extraction of image information, are highly
dependent upon the edge content of the image. An image can be rendered
poor in contrast or resolution, but preservation of edges will
still make the imagery suitable for many human purposes.

The adaptive scheme was developed on a requirement to preserve
edges. A simple edge filter (which can be implemented
optically) is created by convolution. The existence of an edge in
the image is assumed if the edge strength exceeds a certain
threshold. The positions of the edges are coded by a run-length code,
along with the edge values. A low-pass filtered version of the
image (which can also be implemented optically) is transmitted
with little bandwidth requirement. The edges are inserted into
the low-pass version, and the result is an image with soft shapes
that possess sharp edges. The reconstructed images possess nearly
all the information used in typical information extraction tasks,
e.g., identification of objects.
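
The sequence of operations in this paragraph can be sketched for a single image row. The centered-difference edge filter, the box low-pass filter, the threshold value, and every name below are illustrative assumptions rather than the design documented in Appendix F.

```python
def box_smooth(row, radius=2):
    """Low-pass version of the row (moving average, truncated at borders)."""
    n = len(row)
    return [sum(row[max(0, i - radius):min(n, i + radius + 1)])
            / len(row[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def edge_positions(row, threshold):
    """Indices where the centered-difference edge strength exceeds threshold."""
    return [i for i in range(1, len(row) - 1)
            if abs(row[i + 1] - row[i - 1]) > threshold]


def run_lengths(positions):
    """Code edge positions as counts of non-edge pixels between edges."""
    runs, prev = [], -1
    for p in positions:
        runs.append(p - prev - 1)
        prev = p
    return runs


def encode(row, threshold=20, radius=2):
    pos = edge_positions(row, threshold)
    return {"lowpass": box_smooth(row, radius),
            "runs": run_lengths(pos),
            "values": [row[p] for p in pos]}


def decode(code):
    """Insert the transmitted edge values into the low-pass version."""
    out = list(code["lowpass"])
    p = -1
    for run, value in zip(code["runs"], code["values"]):
        p += run + 1
        out[p] = value
    return out


row = [10] * 5 + [100] * 5
restored = decode(encode(row))
```

For the step in `row`, only two edge pixels are transmitted explicitly; everything else comes from the smoothed channel, which is why the scheme adapts to image content.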

Appendix  F  is  a  paper  which  describes  the  adaptive  scheme 
resulting  from  this  edge  detection  process. 

(IV.6) Interframe Bandwidth Compression

Interframe imagery, i.e., imagery from a temporal, sequential
image source such as television, represents a new level of
complexity in the use of optical processors for image bandwidth
compression. The problem is that temporal redundancy cannot be
eliminated without storage of one or more frames prior to the
current frame. It is only possible to identify temporal redundancy
in the context of the change from frame to frame. Unfortunately,
frame storage is difficult in an optical processor system.
Although electro-optical devices exist which can store a frame,
they are of inferior quality when compared to some of the digital
devices which have been recently developed, e.g., digital
semiconductor frame buffer memories. Some new devices, such as CCD's,
offer analog sensing and digital addressing in simple, integrated
devices. Consequently, our research in interframe data
compression has been on the basis of hybrid optical/digital operation.
The employment of digital frame buffers to serve as image storage
is the simplest way to allow for the temporal processing
alignment.

Appendix G is a paper describing our system proposed for
interframe compression. Basically, it consists of an optical
compression component utilizing the IDPCM process discussed in
Appendix B. Since multiple frame imagery is assumed, the output of
the IDPCM processor is captured in frame buffer units, and
conventional DPCM processing between frames is used to reduce the
redundancy between frames.
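
The frame-to-frame DPCM stage with a digital frame buffer can be sketched as follows. The frames here stand in for the output of the in-frame processor; the zero-initialised buffer, the uniform quantizer, and the names are assumptions for illustration and not the system of Appendix G.

```python
def interframe_encode(frames, step=4):
    """DPCM along the time axis, one quantizer index per pixel per frame."""
    buffer = [0] * len(frames[0])   # frame buffer holding the previous reconstruction
    coded = []
    for frame in frames:
        codes = [round((x - b) / step) for x, b in zip(frame, buffer)]
        buffer = [b + q * step for b, q in zip(buffer, codes)]
        coded.append(codes)
    return coded


def interframe_decode(coded, step=4):
    """Accumulate the dequantized frame differences in a matching buffer."""
    buffer = [0] * len(coded[0])
    frames = []
    for codes in coded:
        buffer = [b + q * step for b, q in zip(buffer, codes)]
        frames.append(list(buffer))
    return frames


frames = [[8, 16, 24], [8, 16, 28], [8, 16, 28]]
coded = interframe_encode(frames)
decoded = interframe_decode(coded)
```

When a frame repeats its predecessor, as the third frame does here, every index is zero, which is the temporal redundancy the frame buffer makes visible.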
(V) Conclusions


We believe the papers and results set forth in detail in the
Appendices justify the conclusion that we have demonstrated the
feasibility of using optical processors for image bandwidth
compression. This was the first objective which we set forth in the
Introduction, Section III. What of our second objective, to raise
the sophistication of optical processing methods to the level
enjoyed by digital bandwidth compression? As can be seen from
Figure 1 of Section IV, we need additional research to establish a
breadth of variety in optical computations for image bandwidth
compression, comparable to digital systems. In particular,
bandwidth compression for multi-spectral imagery, better interframe
compression performance, and adaptive compression are all goals
for our future research in this area.
THE DIGITAL IMAGE ANALYSIS LABORATORY

The Digital Image Analysis Laboratory (DIAL) is located in the
Engineering Building on the University of Arizona campus. The DIAL facility is the
focus on the Arizona campus of research involving the processing, manipulation,
and analysis of imagery by digital computers. Basic research is also carried
out in technologies that support digital image analysis, such as signal
processing and numerical techniques.

Besides the use of DIAL facilities in programs of sponsored research,
DIAL figures prominently in the education of graduate students. DIAL
facilities are used for instruction in Remote Sensing, Optical Sciences, and Systems
Engineering. Students from these Departments (as well as others) regularly
work in the DIAL facility.

DIAL  Resources 

The  resources  within  DIAL  consist  of  equipment,  programs,  and  faculty  of 
the  University  of  Arizona. 

(1)  Equipment 

DIAL equipment resources consist of hardware physically located within
the laboratory, and of hardware outside of the laboratory that can be accessed
remotely. Equipment within DIAL consists of the following:

-  PDP-11/70  Computer  with  192K  words  of  storage,  cache  memory, 
floating  point  processor. 

-  RM03  Disc  Storage  Unit,  with  67  Megabytes  of  memory. 

- TU-10 Magnetic Tape Drive, 9 track 800 BPI.

-  DZ-11  8-line  Multiplexer,  with  4  CRT  terminals,  one  printing 
terminal,  one  remote  dial-up  phone  line  modem  terminal. 

-  LP-11  Line  Printer. 

- Stanford Technology Corporation Image Display Unit: 512 x
512 pixels x 8 bits/pixel x 3 colors (Red, Green, Blue),
with Graphics, Feedback Arith./Logic Unit, Precision color
CRT display monitor, interactive trackball control.

Equipment  accessed  remotely  from  DIAL  includes  the  following: 

-  DEC-10/CDC  Cyber  175  Computers  in  the  University  Computer 
Center;  access  from  DIAL  is  a  permanent  1200  baud  telephone 
terminal  modem  set. 


- VAX-11/780 Computer in the Department of Radiology; access is
a 19.2 k baud telephone modem set between the VAX and 11/70
CPU's.

(2)  Software/Programs 

DIAL  software  resources  consist  of  special  programs  written  by  equipment 
manufacturers  and  programs  created  for  more  general-purpose  image  processing. 
Software  resources  include: 
- SADIE, a general-purpose image processing software package
written at the University of Arizona and distributed by the
University of Minnesota to a dozen different sites. SADIE
is available in both a CDC Cyber 175 version and a PDP-11/70
version;

- FACEL, a general-purpose pattern recognition package;

- System 511, an interactive image processing package solely
for the Stanford Technology Corporation image display system.

(3)  Faculty 

Faculty working in DIAL, or associated with DIAL projects, cover a variety
of academic disciplines. Principals currently involved include:

B. R. Hunt, Professor of Systems Engineering and Professor
of Optical Sciences;

P. N. Slater, Professor of Optical Sciences and Chairman
of Remote Sensing Programs;

J. J. Burke, Professor of Optical Sciences;

R. Schowengerdt, Assistant Professor of Remote Sensing.

Some  DIAL  Activities 

A  variety  of  sponsored  research  projects  are  currently  underway  within 
DIAL.  Examples  of  the  current  projects  are: 

-  Simulation  of  optical  processing  for  image  data  compression; 

-  Investigation  of  factors  influencing  the  automatic  compilation 
of  maps  from  aerial  photographs; 

-  Study  Group  to  examine  the  feasibility  of  automating  image 
processing; 

-  Geometric  correction,  rectification,  and  editing  of  images  of 
the  planet  Saturn  from  the  Pioneer  spacecraft; 

- Generation and display of imagery for ranking of image quality
criteria;

- Use of LANDSAT imagery for determination of agricultural
planting patterns in Avra Valley, Arizona.

Further  Inquiries 

For further information, or for inquiries concerning research projects,
sponsorship, or use of DIAL facilities, please contact:


Professor  B.  R.  Hunt 
DIAL 

Department  of  Systems  Engineering 
University  of  Arizona 
Tucson,  Arizona  85721 
(602)626-5157 
OPTICAL COMPUTING FOR IMAGE BANDWIDTH
COMPRESSION: ANALYSIS AND SIMULATION


Reprinted from APPLIED OPTICS, Vol. 17, page 2944, September 16, 1978
Copyright 1978 by the Optical Society of America and reprinted by permission of the copyright owner.
Optical computing for image bandwidth compression:
analysis and simulation

B.  R.  Hunt 

Image bandwidth compression is dominated by digital methods for carrying out the required computations.
This paper discusses the general problem of using optics to realize the computations in bandwidth
compression. A common method of digital bandwidth compression, feedback differential pulse code modulation
(DPCM), is reviewed, and the obstacles to making a direct optical analogy to feedback DPCM are discussed.
Instead of a direct optical analogy to DPCM, an optical system which captures the essential features of
DPCM without optical feedback is introduced. The essential features of this incoherent optical system are
encoding of low-frequency information and generation of difference samples which can be coded with a small
number of bits. A simulation of this optical system by means of digital image processing is presented, and
performance data are also included.


I. Introduction

When digital image processing methods were initially
employed ten to fifteen years ago, the rationale often
voiced in making the decision to use digital computations
was the flexibility of the computer and the relative
inflexibility of optics. Thus, it was often argued, the
digital computer offered the means to explore and
simulate a variety of system configurations. Once the
final optimum configuration became known, this
particular one could be frozen and implemented in optics,
which have the virtues of parallel computations and
wide space-bandwidth product. This early rationale
for digital image processing is heard with much less
frequency nowadays. The revolution in semiconductor
electronics has produced cheap, fast and reliable digital
systems. New digital algorithms, such as the fast
Fourier transform, have made digital image processing
an active and fruitful endeavor, which is carried on for
its own purposes and without reference to flexibility in
simulating optics. In this recent burst of activity in
digital image processing, a remembrance of the early
rationale brings to mind the question: are there optical
computations in image processing which are being
overlooked in the successes of digital processing?

Image data compression is an example of the success of digital image
processing. The motivation for data compression is the great amount of
information that can exist in an image. Even a low-quality image, such
as might be produced by a pocket camera, can contain KF-IO* bits of
information, and a high quality image can contain several orders of
magnitude more bits of information. The transmission and storage of
such masses of data are difficult, and anything which can be done to
eliminate data redundancy is of interest, since there will be a
concurrent reduction in the requirements of transmission bandwidth,
storage, and system costs. Given this motivation, image data
compression has been one of the most successful applications of digital
image processing. As can be seen from existing survey papers on image
data compression,1-3 a variety of different methods have been
investigated and shown to be successful. The success of digital
computations can be seen in efforts currently underway to build and
test prototypes of all-digital compression systems for military
applications.4,5

As favorable as the performance characteristics of current digital
components are, the potential of optical computations should not be
overlooked. There is still merit in analog signal processing when the
task is properly defined. In this paper we discuss the application of
optical methods to the problem of image data compression. We will
consider the computations employed in image data compression from the
viewpoint of how those computations may be realized by optical
processes. In addition, a structure suitable for optical implementation
of image data compression will be presented. The results of simulating
this proposed optical system by digital image processing will be
presented. (This brings things to full circle, i.e., simulating
possible optical systems by digital processes!) Thus, this paper has
two aims: to discuss the applicability of optical computations in image
data compression and to present results of simulating such
computations.

Fig. 1. Schematic of DPCM digital compression.

II. DPCM Digital Image Data Compression

In approaching the problem of performing image data compression by
optical computations, we begin with a brief review of the principles
behind the simplest of image compression schemes, differential pulse
code modulation (DPCM). This review is necessary to motivate the
optical analogies we present below.

The conventional structure of a DPCM image data compression system is
seen in Fig. 1. The data compression process takes place in the
left-hand feedback loop, and reconstruction occurs in the loop at the
right. The basic equations are the following. The past $N$ samples of
previously predicted data are used in a linear predictor to generate a
prediction of the current sample

$$\hat{f}_k = \sum_{j=1}^{N} h_j \tilde{f}_{k-j}, \tag{1}$$

where $k$ is the index of the current sample, $h_j$ is the prediction
weight given to each of the previous predicted samples, and
$\hat{f}_k$ is the prediction of the value of the $k$th sample. (From
Fig. 1, we see that $\tilde{f}_k = \hat{f}_k + g_k$, which is the
relation between a predicted pixel and the quantized value of a
predicted pixel.) The difference between the $k$th sample and the
predicted value of the $k$th sample is computed

$$d_k = f_k - \hat{f}_k, \tag{2}$$

and this difference is quantized

$$g_k = Q(d_k), \tag{3}$$

where $Q$ is the quantizer function. The quantized difference $g_k$ is
then coded for transmission as well as being fed back around the
quantizer, where it becomes input to the prediction computations. As
Fig. 1 shows, the reconstruction process consists of a positive
feedback and combination of the decoded differences with the output of
a prediction computation which is identical to that in the original
compression loop.
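The loop of Eqs. (1)-(3) can be sketched in a few lines of Python. This is a minimal illustration, not the system of Fig. 1 itself: the function names, the zero initial history, and the uniform rounding quantizer standing in for $Q$ are all assumptions made for the example.

```python
import numpy as np

def uniform_quantizer(d, step):
    """Hypothetical Q: round a difference to the nearest multiple of `step`."""
    return step * np.round(d / step)

def dpcm_encode(f, h, step):
    """Feedback DPCM: predict from past reconstructed samples, quantize the difference."""
    N = len(h)
    recon = np.zeros(len(f) + N)       # reconstructed samples, zero initial history (assumed)
    g = np.zeros(len(f))               # quantized differences, i.e. the transmitted data
    for k in range(len(f)):
        past = recon[k:k + N][::-1]    # most recent reconstructed sample first
        pred = np.dot(h, past)         # Eq. (1): linear prediction
        d = f[k] - pred                # Eq. (2): prediction difference
        g[k] = uniform_quantizer(d, step)  # Eq. (3): quantization
        recon[k + N] = pred + g[k]     # fed back around the quantizer
    return g

def dpcm_decode(g, h):
    """Receiver: the same predictor, driven by the decoded differences."""
    N = len(h)
    recon = np.zeros(len(g) + N)
    for k in range(len(g)):
        past = recon[k:k + N][::-1]
        recon[k + N] = np.dot(h, past) + g[k]
    return recon[N:]
```

Because the encoder predicts from the same reconstructed samples that the decoder will have, the reconstruction error at each sample is just the quantization error of that sample's difference, which for this quantizer is at most half a step.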

The workings of DPCM are extremely simple but are sometimes difficult
to understand because of the feedback occurring around the quantizer.
To show the basic workings of DPCM we resort to a z transform analysis.
First we replace the quantizer in the compression system by the
injection of an additive source of quantization noise $n_k$ as seen in
Fig. 2. That is, the quantizer is modeled as a linear addition of noise
$n_k$ to the $k$th input of the quantizer. The replacement of a
quantizer by an additive noise source is a standard assumption in
digital signal processing.6 We use the z transform notation that

$$F(z) = \sum_{k} f_k z^{-k}, \tag{4}$$

and likewise for other symbols. In Fig. 2 we have two feedback loops,
one at node 1 and one at node 2. For node 1 it is direct to write the
relation

$$\hat{f}_k = \sum_{j=1}^{N} h_j \left(g_{k-j} + \hat{f}_{k-j}\right). \tag{5}$$

Using the z transform property of discrete convolutions6 (identical to
that for Fourier transforms and continuous convolutions), we have

$$\hat{F}(z) = H(z)\left[G(z) + \hat{F}(z)\right]. \tag{6}$$

Solving for $\hat{F}(z)$ the result is

$$\hat{F}(z) = \frac{H(z)\,G(z)}{1 - H(z)}. \tag{7}$$

Likewise at node 2 we have (in z transforms)

$$G(z) = F(z) - \hat{F}(z) + N(z) = F(z) - \frac{H(z)\,G(z)}{1 - H(z)} + N(z). \tag{8}$$

Solving for $G(z)$ we have the result

$$G(z) = [1 - H(z)]\,[F(z) + N(z)]. \tag{9}$$


Fig. 2. Replacement of quantizer by additive noise.

Fig. 3. Alternate configuration with quantizer outside of feedback
loop.


The reconstruction system at the receiver is directly seen to have the
z transform description (assuming there are no transmission or decoding
errors)

$$F_r(z) = \frac{G(z)}{1 - H(z)}, \tag{10}$$

and the resulting output of the over-all system is

$$F_r(z) = F(z) + N(z). \tag{11}$$
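The step from the receiver description to the over-all output can be written out explicitly. The short derivation below only substitutes the encoder result $G(z) = [1 - H(z)][F(z) + N(z)]$ of Eq. (9) into the receiver relation of Eq. (10); no symbols beyond those of the text are introduced.

```latex
\begin{aligned}
F_r(z) &= \frac{G(z)}{1 - H(z)}
        = \frac{[1 - H(z)]\,[F(z) + N(z)]}{1 - H(z)} \\
       &= F(z) + N(z).
\end{aligned}
```

The factor $1 - H(z)$ introduced by the compression loop is cancelled exactly by the receiver filter, for the signal and for the noise alike.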

We see that the total compression system adds only quantization noise
to the reconstructed data. We also see the importance of the structure
of the compression loop. If the quantizer were outside the feedback
loop as in Fig. 3, the equivalent of Eq. (9) would be

$$G(z) = [1 - H(z)]\,F(z) + N(z), \tag{12}$$

and the reconstructed data would have the z transform

$$F_r(z) = F(z) + \frac{N(z)}{1 - H(z)}. \tag{13}$$

The differences between Eqs. (11) and (13) can be used to highlight the
basic operations of DPCM. The optimal design of the predictor weights
$h_j$ can be carried out by minimizing the mean-square-error of
prediction.2 Since it is possible to model images as Markov processes
of small finite order (e.g., third order), the optimal weights for
$h_j$ usually result in a low-pass transfer function.7 The transfer
function $1 - H(z)$ is thus a high pass transfer function. This is
expected behavior, since an accurate prediction $\hat{f}_k$ of the
current sample $f_k$ implies that the difference $d_k$ will consist
only of features that cannot be predicted, i.e., high frequency image
details that are not predictable from the general low-frequency trends
in the image data. Further, since $[1 - H(z)]$ is a high pass process,
the quotient $1/[1 - H(z)]$ will have appreciable amplitude at low
frequencies, decreasing at higher frequencies, i.e., it is the
frequency characteristic of an integrating filter. In Eq. (11), the
quantizer being within the feedback loop causes the transfer function
of the integrating filter to be completely canceled. In Eq. (13),
however, the integrating filter is actually integrating the
quantization noise throughout the image data, and the actual
reconstruction errors are much greater.
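The contrast between Eqs. (11) and (13) can be checked numerically. The sketch below assumes a one-tap predictor $H(z) = a z^{-1}$ and a uniform rounding quantizer; both are illustrative choices, as are the function names.

```python
import numpy as np

def quantize(d, step):
    """Hypothetical uniform quantizer standing in for Q."""
    return step * np.round(d / step)

def open_loop_dpcm(f, a, step):
    """Quantizer OUTSIDE the loop (Fig. 3): differences use the true past sample."""
    prev = np.concatenate(([0.0], f[:-1]))
    g = quantize(f - a * prev, step)      # G = (1 - H) F + N, Eq. (12)
    out = np.zeros(len(f))
    acc = 0.0
    for k in range(len(f)):               # receiver filter 1 / (1 - H)
        acc = a * acc + g[k]
        out[k] = acc
    return out

def closed_loop_dpcm(f, a, step):
    """Quantizer INSIDE the loop (Fig. 1): differences use the reconstructed past sample."""
    out = np.zeros(len(f))
    acc = 0.0
    for k in range(len(f)):
        pred = a * acc
        acc = pred + quantize(f[k] - pred, step)
        out[k] = acc
    return out
```

With the quantizer inside the loop the error never exceeds half a quantizer step. With the quantizer outside, the receiver filter accumulates the quantization noise from sample to sample, so the error is no longer bounded by a single step.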

The salient points of DPCM compression can be summarized in the
following three properties:

(1) The prediction and differencing steps remove those portions of the
image which are predictable on the basis of past samples, i.e., the
low-frequency information. This low-frequency information is recreated
in the reconstruction process by the integrating filter.

(2) If the prediction is accurate, the difference $d_k$ will be much
smaller in absolute magnitude than the incoming data samples $f_k$.
Since the quantization noise is directly proportional to the variance
of the input,8 the error induced by quantization of the smaller
amplitude differences is decreased. For example, if the original
samples $f_k$ spanned a range of 1-1000 (on some arbitrary scale),
quantization accuracy of one part in 1000 would require 10 bits of
quantization. However, if the prediction/difference process yields
values $d_k$ with a range of 1-10, the same absolute quantization
accuracy could be obtained with 4-bit quantization and with appreciable
reduction in code bits.

(3)  The  DPCM  process  is  causal,  i.e.,  the  nature  of 
prediction  implies  an  ordering  to  the  samples.  This 
ordering  is  shown  as  1-D  in  the  equations  and  diagrams 
above,  but  a  2-D  image  is  naturally  ordered  by  a  raster 
scan  process  such  as  used  in  conventional  video  systems. 
In  a  raster  scan  the  2-D  data  available  to  the  prediction 
process  are  characterized  as  data  to-the-left-and-above 
the  current  position  of  the  scanning  spot  (assuming  a 
conventional  top-to-bottom  left-to-right  scanning 
system). 
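The bit counts quoted in property (2) follow from taking the base-2 logarithm of the number of levels to be distinguished and rounding up. A small sketch (the function name is illustrative):

```python
import math

def bits_for_levels(levels):
    """Smallest number of bits whose code words can label `levels` distinct values."""
    return math.ceil(math.log2(levels))

full_range_bits = bits_for_levels(1000)   # samples spanning 1-1000
difference_bits = bits_for_levels(10)     # differences spanning 1-10
```

Here `full_range_bits` evaluates to 10 and `difference_bits` to 4, matching the figures in the text.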

III. Direct Optical Analogies to Digital DPCM

Given the three properties of digital DPCM summarized above, it is
immediately evident that one property of the digital process is not
applicable to the optical case: the property of causality and the
related processes of prediction. Causality is inherent in the
prediction process, i.e., the past samples are used to compute a
prediction of the next sample, a process which assumes that the next
sample is unknown until the elapse of the next sample time. The close
tie between this process and the raster scanning of an image is
obvious.

Optical image formation occurs simultaneously over the image plane, and
raster scanning of the image artificially imposes a causal ordering
upon the totally parallel or noncausal process of image formation. A
naive approach to constructing an optical analogy to DPCM would be to
replace the causal system presented in Fig. 1 by a noncausal system
with the same structure, i.e., a system having the same structure but
functioning in the parallel optical mode. Such a hypothetical processor
can be seen in Fig. 4. It is a coherent optical system with feedback.
Fully parallel combinations of images serve to create negative or
positive feedback. In addition, a fully parallel optical quantizer is
included within the feedback loop.

Fig. 4. Coherent optical analogy to DPCM compression.

The equations which describe such a system can be written from the
diagram of Fig. 4. It is direct to show that the equilibrium output
(i.e., the output after an initial light wave passes around the system,
and it settles) is given by

$$g(x,y) = Q\left[f(x,y) - \int\!\!\int p(x - x_1,\, y - y_1)\, g(x_1, y_1)\, dx_1\, dy_1\right], \tag{14}$$

where $p(x,y)$ is the spread function of the filter in the lower loop,
i.e.,

$$p(x,y) = \mathcal{F}^{-1}\left\{\frac{H(u,v)}{1 - H(u,v)}\right\}, \tag{15}$$

and $Q$ is the quantizer function. Equations (14) and (15) describe the
space-domain output of the noncausal compression system, i.e., the
image plane emerging from the compression system is described in the
space domain by the solution of the nonlinear integral Eq. (14), given
the spread function defined in Eq. (15).

What  can  be  stated  about  the  feasibility  of  a  system 
such  as  seen  in  Fig.  4?  The  following  points  are  worth 
noting: 

First, the coherent filters in both compression and reconstruction
systems represent the best defined technology, and methods to achieve
the filter response are described in detail in the literature.

Second, the fully parallel optical quantizer is an emerging technology.
The nonlinear optical methods of Dashiell and Sawchuk9 would be
applicable to the synthesis of the quantizer, but the utilization of
such a nonlinear element integrated into a system with other optics,
including feedback, raises difficult questions, i.e., errors in the
optical quantization process and cumulative effects of these errors
after many trips around the loop, when the feedback system settles into
its equilibrium state.

Third, optical feedback has been found to be possible only by
painstaking measures to ensure component rigidity and stability,
freedom from random perturbations in the optical path, control of the
coherence length of the illumination, etc.10,11 The problems arise
because the system shown in Fig. 4 is, in essence, an image forming
interferometer. The problems associated with it are not necessarily
insurmountable, but they are not to be taken lightly.

IV. Compression

At the close of Sec. II, three important points of digital DPCM were
stated. The third point, causality, was dealt with in the conceptual
formulation of Fig. 4, i.e., the causality constraint of digital DPCM
was eliminated by a noncausal optical system having a schematic block
diagram equivalent to digital DPCM. We now reexamine the other two
points. DPCM works because (1) low frequencies in the image can be
predicted, removed from transmission by a difference step, and then
recreated at the receiver; and (2) the high frequencies, which are not
eliminated by prediction and differencing, possess only a small amount
of the total image energy and can be accurately coded with a small
number of bits. Given these two points, the development of optical
analogies to digital DPCM data compression can be perceived in terms of
the coding of low-frequency information, and the coding of the
differences between low-frequency and high frequency information.

There are many ways to encode the low-frequency information. A key
feature is to encode the low-frequency information in such a way that
low-frequency image details can be reconstructed at the receiver
without undue amplification of the quantization noise. It is this
requirement which makes it necessary to place the quantizer inside the
feedback loop in Fig. 4 and which subsequently makes the system
difficult to analyze. [Recall the discussion associated with Eq. (13)
to see that placing the quantizer outside of the feedback loop would
amplify the quantization noise.] We choose to encode the low-frequency
information by sampling and interpolation, since these processes are
simple to implement optically.

Figure 5 shows a simple incoherent optical system for the process of
image data compression. An image of the scene is formed on a mask,
which is an opaque screen with small holes periodically spaced in it.
The mask functions as an optical sampling element. The samples
extracted from the image by the mask are then optically interpolated to
create a low-frequency version of the original image. The interpolation
is accomplished in a classical incoherent convolution12 using an
apodized aperture and a misfocused lens. Then the low-frequency version
created by interpolation is subtracted from the original unsampled
in-focus image. Finally, the samples from the mask and the difference
image are quantized and transmitted.

The  following  points  can  be  made  about  the  detailed 
workings  of  the  system: 

(1) The samples are a crude representation of the original image, since
the spacing of the samples would be chosen not to satisfy the Nyquist
criterion but by a desire to represent low-frequency information in a
small number of samples. For example, suppose the original image had a
Nyquist imposed resolution of 512 X 512 pixels. Choosing a 128 X 128
sampling mask would encode the low frequencies at only 1/16 the data
requirements of the original image. Further data reductions are
possible because these low-frequency samples need not be quantized at
full resolution, i.e., instead of 8 bits/pixel, 3 or 4 bits/pixel could
be used.

(2) The coarse sampling from the mask results in a sampled image which
is badly aliased. Likewise, coarse quantization of the optical
intensities sampled by the mask induces quantization error. However,
both aliasing and quantization errors are encoded in the difference
image (along with image high frequency information) and are
reintroduced into the reconstructed image (see point 5).

(3) The optical interpolation acts upon that portion of the image least
affected by the aliasing from the sampling: the low frequencies. A
misfocused lens is suitable for image convolution by a low-pass
filter.12 As discussed in the previous sections, the general
characteristic of a prediction filter in DPCM is a low-pass
characteristic. In the system of Fig. 5 we replace the prediction
filter of DPCM by an interpolation filter, i.e., the misfocused lens
with associated aperture apodization interpolates between the samples
to fill in the gaps with a low-frequency version of the original image.

(4) The nature of the interpolated low-frequency version depends upon
the interpolation criterion adopted. An interpolation criterion can be
stochastic, e.g., minimization of mean-square-error can be used to
derive a well known result, which determines the interpolator
characteristics in terms of the autocorrelation function of the
data.13 A deterministic interpolation criterion can be specified to
represent exactly a polynomial of nth order, and an attractive class of
polynomial interpolating functions for images are the B-splines, which
can be implemented as convolutions.14 Of course, if the image data
possess an autocorrelation function which is identical to an nth order
polynomial, these two different approaches (stochastic vs deterministic
interpolation criteria) become equivalent.

(5) At the receiver the reconstruction is similar to the compression.
Low-frequency samples are written onto an optical modulator, and
identical interpolation is performed to create the low-frequency image.
The low-frequency image is summed with the difference image to recreate
an approximation to the original.

(6) All the computations to carry on the compression and reconstruction
process can be carried on optically or electronically. The difference
between the interpolated image and the original image can be performed
either electronically (e.g., use two vidicons looking at the two
images, with the synchronization signals slaved together, and create
the difference by an operational amplifier) or can be performed
electrooptically (e.g., using such electrooptic devices as the PROM or
the liquid crystal15,16). The only digital circuitry required would be
in the quantization of samples from the mask and from the difference
image. Note that, since these will be coarse quantizations, 2-4
bits/sample, the associated A-D converters will be faster in
performance than A-D converters used to quantize an analog scene at
8-10 bits for input to digital DPCM processing. The optical system of
Fig. 5 should be able to operate at greater data rates and be
significantly simpler than an equivalent digital DPCM device.

This method of creating a DPCM image is referred to as interpolated
DPCM or IDPCM to distinguish it from ordinary DPCM. We summarize the
entire process in the following:

First, the image is sampled by a mask, and the samples are quantized
and transmitted.

Second, the sampled image is interpolated by a low-pass convolution

$$f_i(x,y) = h(x,y) * f_s(x,y), \tag{16}$$

where $h$ is the PSF of the interpolator, and $f_s$ is the sampled
image created in the first step.

Third, the difference image is computed

$$d(x,y) = f(x,y) - f_i(x,y), \tag{17}$$

and all samples of it are quantized and transmitted.

Fourth, at the receiver the mask samples, i.e., the samples of $f_s$,
are again interpolated to form a low-frequency version.

Fifth, the difference samples are added to the interpolator output to
reconstruct the original.

A simple theory of the operation of the IDPCM system in Fig. 5 can be
readily derived. After the mask samples have been quantized, we model
the quantizer as in Sec. II and have

$$Q[f_s(x,y)] = f_s(x,y) + n_1(x,y), \tag{18}$$

where $n_1$ is the noise from quantization of the mask samples.
Likewise, a quantization acts upon the difference between the original
image and the interpolated mask samples, so this second quantization
process is modeled as

$$Q[d(x,y)] = f(x,y) - f_i(x,y) + n_2(x,y), \tag{19}$$

where $n_2$ is the noise from the second quantization process. At the
receiver, the received quantized mask samples are reinterpolated and
added to the received quantized differences. Thus, the reconstructed
image is

$$\begin{aligned}
f_r(x,y) &= h(x,y) * [f_s(x,y) + n_1(x,y)] + f(x,y) - h(x,y) * f_s(x,y) + n_2(x,y) \\
         &= f(x,y) + h(x,y) * n_1(x,y) + n_2(x,y).
\end{aligned} \tag{20}$$

This equation demonstrates that the reconstructed image has two noise
contributions. The noise term $n_2$ is due to quantization of
differences and is, hence, identical in nature to the quantized
difference noise seen in Eq. (11). The other noise term is unique to
the IDPCM process and is the convolution of the interpolating function
with the noise introduced by quantization of the mask samples. The
noise from a quantizer tends to have a zero mean (positive and negative
values both occur4). The averaging of values which takes place in a
low-pass convolution with the interpolating function results in the
spatial average of the mask-sample quantization noise tending toward
zero, particularly near sharp edges in the image. An area of the image
where the underlying image structure is predominantly low frequency can
give rise to quantization errors in that spatial region with a single
algebraic sign or a predominance of one sign. In such cases the
convolution with the interpolating function does not average out the
quantization noise of mask samples in the image. The result is a
visually noticeable error in representation of low-frequency image
structure (e.g., regions of constant or near-constant intensity) but
few visually detectable errors around sharp edges of the image. This
latter fact is fortunate, since image edge information is usually the
most sensitive in subjective viewer evaluations, as can be seen in the
images presented in the following section.

The important feature of this simple analysis of the IDPCM process is
that it demonstrates that the reconstruction process does not grossly
amplify the quantization noise present in the data, a severe problem
with the nonfeedback quantization of Eq. (13). Indeed, the quantization
noise in the mask samples is reduced by the convolution with the
interpolating function, the actual magnitude of the reduction being
dependent upon the nature of the interpolating function chosen and the
specific region of the image.
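The size of this reduction can be estimated for one specific case. The sketch below assumes zero-mean, uncorrelated quantization noise of equal variance on every mask sample and a separable bilinear (triangle) interpolator with sample spacing `K`; neither assumption comes from the analysis above, and the first fails in the flat image regions just discussed. Under these assumptions the output noise variance at a pixel is the input variance times the sum of the squared interpolation weights reaching that pixel.

```python
import numpy as np

def mean_noise_gain(K):
    """Average variance gain of white mask-sample noise after bilinear interpolation.

    At a fractional offset r between two mask samples, the 1-D weights are
    (1 - r) and r, so the 1-D variance gain is (1 - r)**2 + r**2. The 2-D
    gain is the product of the two 1-D gains because the kernel is separable.
    """
    r = np.arange(K) / K
    gain_1d = (1.0 - r) ** 2 + r ** 2
    return float(np.mean(np.outer(gain_1d, gain_1d)))

gain_for_every_4th_pixel = mean_noise_gain(4)
```

For `K = 4` the average gain is about 0.47, i.e. the interpolated noise carries a little under half the variance of the mask-sample noise when averaged over all pixel positions. At the mask positions themselves the gain is 1, so no reduction occurs there.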

V.  Simulations  of  the  Optical  Compression  Scheme 

A series of digital image simulations have been carried out to verify
the validity of the IDPCM process. Figure 6 is an original image,
sampled at 9 bits/pixel on a 480 X 480 raster. The compression steps
carried out in the digital simulation process were in the same sequence
as described above. The original image was subsampled, retaining every
4th pixel of every 4th line and creating a 120 X 120 image of mask
samples. Each subsample in this 120 X 120 array was quantized to 3 bits
(8 levels) with a uniform quantizer, the maximum and minimum
quantization levels being the maximum and minimum of the subsamples.
Thus, the low frequencies were encoded with 120 X 120 X 3 bits of total
information. These samples were transmitted.

The image intensities from the sample mask were interpolated to fill in
the missing data values. This was done by inserting zeros into the
positions where pixels were missing and then convolving the resulting
480 X 480 array with a 7 X 7 bilinear interpolation kernel.17 The
interpolated image, being of the same 480 X 480 resolution as the
original image, was now subtracted from the original, and the
differences were quantized at $N_D$ bits, which was varied. The
quantization rule was a tapered quantizer, based upon the Laplacian
density used by O'Neal in quantizing differences in digital DPCM.7 The
low-frequency mask samples and the quantized differences constituted
the information that would be transmitted in a real system. The image
was reconstructed, in a simulation of the receiver, by reinterpolating
the low-frequency mask samples from 120 X 120 to 480 X 480 (using the
same bilinear interpolator as in the transmitter simulation), and the
result was added to the quantized differences to reconstruct the image.
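The simulation steps can be sketched as follows. This is not the original simulation code: the function names are hypothetical, a uniform quantizer with an assumed `diff_range` stands in for the tapered Laplacian quantizer, and the transmitter interpolates the unquantized mask samples, as in the model of Eqs. (16)-(20).

```python
import numpy as np

def bilinear_taps(K):
    """1-D triangle with 2K-1 taps; K = 4 gives the 7 taps behind a 7 x 7 kernel."""
    return 1.0 - np.abs(np.arange(-(K - 1), K)) / K

def interpolate(samples, K, shape):
    """Insert zeros between mask samples, then convolve rows and columns with the triangle."""
    grid = np.zeros(shape)
    grid[::K, ::K] = samples
    taps = bilinear_taps(K)
    smooth = lambda v: np.convolve(v, taps, mode="same")
    grid = np.apply_along_axis(smooth, 0, grid)
    return np.apply_along_axis(smooth, 1, grid)

def uniform_quantize(x, bits, lo, hi):
    """Uniform quantizer with 2**bits levels whose extreme levels are lo and hi."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + step * np.round((np.clip(x, lo, hi) - lo) / step)

def idpcm_encode(f, K=4, n_low=3, n_diff=3, diff_range=32.0):
    fs = f[::K, ::K]                                   # mask samples
    fs_q = uniform_quantize(fs, n_low, fs.min(), fs.max())
    fi = interpolate(fs, K, f.shape)                   # low-frequency version
    d_q = uniform_quantize(f - fi, n_diff, -diff_range, diff_range)
    return fs_q, d_q                                   # the transmitted data

def idpcm_decode(fs_q, d_q, K=4):
    return interpolate(fs_q, K, d_q.shape) + d_q
```

As long as no difference is clipped by `diff_range`, the reconstruction error is the sum of the interpolated mask-sample quantization error and the difference quantization error. Near the image border the zero padding of the convolution makes the differences large, so a real implementation would need some border treatment that this sketch omits.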

Total bits required in the simulations can be calculated from the
equation

$$B_{\text{total}} = \frac{480 \times 480}{16} \times 3 + 480 \times 480 \times N_D = \frac{3}{16} N^2 + N^2 N_D, \tag{21}$$

where the original image is of size $N$ by $N$, and $N_D$ is the number
of bits chosen to encode the differences. The number of bits/pixel is

$$B/\text{pixel} = \frac{3}{16} + N_D. \tag{22}$$

Fig. 7. IDPCM compression, >3 bits/pixel (see text).

Fig. 8. IDPCM compression, >2 bits/pixel (see text).

In general, if $N_L$ bits are used to encode the low-frequency mask
samples and if the mask retains every $K$th pixel of every $K$th line,
the above equations become

$$B_{\text{total}} = \frac{N^2}{K^2} N_L + N^2 N_D, \tag{23}$$

$$B/\text{pixel} = \frac{N_L}{K^2} + N_D. \tag{24}$$

The quantities $N$ and $K$ are interpretable in terms of the simulation
discussed above. However, even in a real optical implementation of
IDPCM, in which there is no image originally sampled at $N$ by $N$
resolution, the parameters $N$ and $K$ are still meaningful. Even
though an optical system would not initially sample the image to be
coded, it is still possible to express the information content of the
original image in terms of an $N$ by $N$ sampling. For example, $N$ by
$N$ could be the size of the sample array of an image sampled at the
Nyquist rate corresponding to the incoherent diffraction-limited
frequency cutoff of the optical aperture. A choice of $N$ by $N$ less
than the sampling rate at the diffraction limit would indicate a
decision to allow a given degree of aliasing. In this context, the
parameter $N$ does not represent the sampling of an image such as used
in the simulations being described but represents the information
requirements of an image acquisition/transmission system with no data
compression. Likewise, in the context of a real optical implementation
of IDPCM, the parameter $K$ represents the subsampling of the nominal
$K$th pixel of every $K$th line for the data compression processes of
IDPCM.
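The bit-count relations are easy to evaluate directly. The sketch below uses illustrative names: `n_low` is the number of bits per mask sample, `n_diff` the number of bits per difference sample, `N` the image side in pixels, and `K` the subsampling interval, with `N` assumed to be a multiple of `K`.

```python
def idpcm_total_bits(N, K, n_low, n_diff):
    """Total transmitted bits: (N/K)^2 mask samples plus N^2 difference samples."""
    return (N // K) ** 2 * n_low + N * N * n_diff

def idpcm_bits_per_pixel(K, n_low, n_diff):
    """Bits per pixel of the original N x N raster."""
    return n_low / K ** 2 + n_diff

total_bits = idpcm_total_bits(480, 4, 3, 3)
rate_3bit = idpcm_bits_per_pixel(4, 3, 3)
rate_2bit = idpcm_bits_per_pixel(4, 3, 2)
```

For the simulation parameters (480 X 480 image, every 4th pixel of every 4th line, 3 bits per mask sample), 3-bit differences give 734,400 bits in total, or 3.1875 bits/pixel, and 2-bit differences give 2.1875 bits/pixel.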

Figure 7 is the result of encoding and reconstruction for $N_D = 3$
bits/difference sample. The image is virtually identical to the
original image. Figure 8 shows the same result for $N_D = 2$
bits/difference sample. Some distortion is visible in Fig. 8, but the
over-all quality remains high. Figure 7 has normalized mean-square
error of 0.3%. For Fig. 8 the error is 1.4%. The normalized
mean-square-error is expressed as

$$\text{NMSE} = \frac{\sum \left[f_r(x,y) - f(x,y)\right]^2}{\sum f(x,y)^2}. \tag{25}$$

The  NMSE  measure  is  commonly  used  to  provide  an 
error  measure  that  is  independent  of  the  mean  intensity 
level  of  the  original  image.  It  is  worth  noting  that  the 
NMSE  values  for  the  IDPCM  process  in  Table  I  are 
comparable  with  NMSE  values  in  conventional  digital 
DPCM  compression  schemes.1 
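As a minimal sketch of Equation (28), the following Python function computes the NMSE between an original image and its reconstruction. The function name `nmse` and the tiny 2 by 2 test arrays are illustrative assumptions, not data from the simulations reported here.

```python
def nmse(original, reconstructed):
    """Normalized mean-square error between two equally sized 2-D images,
    each given as a list of rows of pixel intensities."""
    num = 0.0  # sum of squared reconstruction errors
    den = 0.0  # sum of squared original intensities
    for row_f, row_r in zip(original, reconstructed):
        for f, fr in zip(row_f, row_r):
            num += (fr - f) ** 2
            den += f ** 2
    if den == 0.0:
        raise ValueError("original image has zero energy")
    return num / den


# Hypothetical 2 x 2 example (not taken from the report's figures).
original = [[10.0, 20.0], [30.0, 40.0]]
reconstructed = [[11.0, 19.0], [30.0, 42.0]]
print(f"NMSE = {100 * nmse(original, reconstructed):.2f}%")
```

Because the error energy is divided by the energy of the original image, scaling both images by the same constant leaves the result unchanged.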

Table  I  summarizes  the  performance  of  the  IDPCM 
simulation  on  Fig.  6.  The  compression  ratio  is  the 
bits/pixel  of  the  original  image  divided  by  the  bits/pixel 
of  the  compressed  image.  Table  I  also  tabulates  the 
coding  efficiency  that  could  be  achieved  if  a  more  so¬ 
phisticated  method  of  coding  pixel  differences  is  used. 
Pixel  differences  were  found  to  be  described  accurately 
by  a  Laplacian  probability  density  such  as  used  by 
O’Neal  in  conventional  DPCM.7  Using  8-level  and 
4-level  Huffman  codes,  based  on  the  Laplacian  density, 
gives  the  coding  performance  shown  in  Table  I.  A 
Huffman  code  saves  an  additional  0.5  bit  (approxi¬ 
mately),  but  the  cost  is  in  much  greater  complexity  in 
dealing  with  the  resulting  variable  length  code  words. 
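The trade-off between fixed-length and Huffman coding of Laplacian-distributed differences can be illustrated with a short Python sketch. The quantizer used here (uniform bins of width equal to one standard deviation, with the outermost bins extending to infinity) is an assumption chosen only for illustration, so the printed averages are not the values of Table I; the function names are likewise hypothetical.

```python
import heapq
import math


def laplacian_bin_probs(levels, step, sigma=1.0):
    """Bin probabilities for a symmetric uniform quantizer applied to a
    zero-mean Laplacian density with standard deviation sigma.
    The two outermost bins extend to infinity."""
    half = levels // 2
    lam = math.sqrt(2.0) / sigma          # Laplacian decay rate
    probs = []
    for i in range(half):
        tail_lo = math.exp(-lam * i * step)
        tail_hi = 0.0 if i == half - 1 else math.exp(-lam * (i + 1) * step)
        p = 0.5 * (tail_lo - tail_hi)     # one side of the density
        probs += [p, p]                   # positive and negative bins
    return probs


def huffman_lengths(probs):
    """Code-word length of each symbol in a binary Huffman code."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)                      # unique tie-breaker for merged nodes
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:               # every merge adds one bit
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


for levels in (8, 4):
    probs = laplacian_bin_probs(levels, step=1.0)
    lengths = huffman_lengths(probs)
    avg = sum(p * n for p, n in zip(probs, lengths))
    fixed = math.log2(levels)
    print(f"{levels}-level: fixed {fixed:.0f} bits, Huffman {avg:.2f} bits")
```

The code words have unequal lengths, which is the source of the added decoding complexity noted above.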

VII. Concluding Comments

The  simulations  presented  above  indicate  the  validity 
of  relatively  simple  optical  systems  which  can  carry  out 
all the computations necessary to achieve data compression
of images. Two goals are the direct result of
this  initial  demonstration  by  digital  simulation.  First 
is  the  goal  of  constructing  an  actual  system  based  on  the 
architecture  of  Fig.  5.  This  goal  will  be  pursued 
(achievement  of  the  goal  being  dependent  upon  the 
usual  constraint,  finding  funds  to  build  the  device).  A 
second  goal  is  more  encompassing.  If  optical  compu¬ 
tations  for  image  data  compression  are  feasible,  the 
obvious  action  is  to  undertake  the  research  necessary 
to  raise  the  level  of  sophistication  for  optical  data 
compression  methods  to  that  currently  enjoyed  by  all- 
digital  processes. 

This  work  was  performed  under  the  sponsorship  of 
the  U.S.  Air  Force  Office  of  Scientific  Research  grant 
AFOSR-76-3024. 
Analysis of Feedback
Optical/Video Systems
As mentioned in Section IV.2, fully parallel DPCM would be
an  ideal  component  to  process  the  correlation  in  image  data,  but 
such  structures  require  coherent  optical  feedback  and  were  dis¬ 
counted  in  previous  research  under  this  Grant.  The  basic  problem 
of  coherent  optical  feedback  is  phase  ambiguity;  optical  wave¬ 
lengths  are  very  short  compared  to  the  dimensions  of  a  feedback 
device,  and  the  signal  phase  control  must  be  equally  precise  over 
the  entire  image  plane.  We  believe  we  have  conceived  a  structure 
which  will  make  it  possible  to  combine  the  functional  flexibility 
of  optical  feedback  with  the  simplified  phase  control  of  elec¬ 
tronic  circuitry. 

We  wish  to  begin  with  a  discussion  of  the  general  nature  of 
two-dimensional  feedback  systems.  Obviously,  image  feedback,  of 
the  sort  used  in  a  DPCM  system,  involves  feedback  of  spatial  and 
temporal  information,  but  we  will  neglect  temporal  feedback  to 
introduce  the  relevant  concepts  and  then  examine  the  associated 
temporal stability questions later.

A  general  structure  for  a  two-dimensional  feedback  system  is 
seen  in  Figure  1.  H  and  G  are  two-dimensional  Fourier  transforms 
of  the  associated  point-spread-functions,  and  R  and  C  are  the 
transforms  of  the  corresponding  input  and  output.  It  is  obvious 
that  the  output  of  the  processor  can  be  written  as: 

$$C(u,v) = \frac{\gamma\alpha\, G(u,v)}{1 + \alpha\beta\, G(u,v)\, H(u,v)}\, R(u,v) \tag{1}$$

where $\alpha$, $\beta$, and $\gamma$ are constants that are selectable and/or
represent fixed gain constants of the loop elements. If $\alpha$ is set
to unity and g(x,y) is taken to be the Dirac function $\delta(x,y)$,
then taking $\beta = \gamma$ the closed loop transfer function of the feedback
processor becomes

$$\frac{1}{\dfrac{1}{\beta} + H(u,v)}$$

Setting H(u,v) = B(u,v) and letting $\beta$ become arbitrarily large we
obtain

$$C(u,v) = \frac{1}{B(u,v)}\, R(u,v) \tag{2}$$

On the other hand, for $\alpha = 1$, $\gamma = 1$, and $\beta = 0$, we have

$$C(u,v) = G(u,v)\, R(u,v) \tag{3}$$

Another response can be obtained by setting $\alpha = 1$, $\beta = \gamma$,
$G(u,v) = B^*(u,v)$ (where * indicates the complex conjugate), and
H(u,v) = B(u,v), whereby we obtain the transfer function

$$\frac{C(u,v)}{R(u,v)} = \frac{\beta\, B^*(u,v)}{1 + \beta\, |B(u,v)|^2} \tag{4}$$

which is recognized as the Wiener filter if $\beta$ = SNR (the signal-
to-noise ratio).
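The special cases above can be checked numerically at a single spatial frequency. The following Python sketch evaluates the closed-loop response of Equation (1) for each choice of gains; the function name `closed_loop` and the sample value of B(u,v) are assumptions made for illustration.

```python
def closed_loop(G, H, alpha=1.0, beta=1.0, gamma=1.0):
    """Ratio C/R of Equation (1) at one spatial frequency (u, v)."""
    return gamma * alpha * G / (1.0 + alpha * beta * G * H)


B = 0.5 + 0.2j  # hypothetical value of B(u,v) at one frequency

# Inverse filter, Equation (2): g = delta so G = 1, H = B, beta = gamma large.
big = 1e9
inverse = closed_loop(G=1.0, H=B, beta=big, gamma=big)
print("inverse filter:", inverse, "vs 1/B =", 1 / B)

# Open loop, Equation (3): beta = 0 leaves the forward filter alone.
forward = closed_loop(G=B, H=B, beta=0.0, gamma=1.0)
print("open loop     :", forward)

# Wiener-type filter, Equation (4): G = B*, H = B, beta = gamma = SNR.
snr = 100.0
wiener = closed_loop(G=B.conjugate(), H=B, beta=snr, gamma=snr)
print("Wiener filter :", wiener)
```

Dividing the numerator and denominator of Equation (4) by the loop gain gives the equivalent form B*/(1/SNR + |B|^2), which is the expression the test of this sketch compares against.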

These  two  examples  have  been  presented  to  illustrate  the 
flexibility of a feedback synthesis. Recently Häusler and Lohmann
([1] and [2]) have proposed the use of a closed loop TV system
to create a feedback processor using incoherent light. In this
research  negative  feedback  was  achieved  by  a  combination  of  mod¬ 
ulation  techniques  and  an  optical  summation  using  a  beam  splitter. 
While  sums  are  easily  obtained  optically,  considerable  care  and 
expertise  are  required  in  implementing  differences  with  incoherent 
light, as Lohmann's example illustrates. It would seem that the
subtraction  step  could  be  achieved  electronically  with  greater 
ease.  Thus,  one  way  of  implementing  the  feedback  processor  of 
Figure  1  would  be  as  in  Figure  2.  Here  r(x,y,t)  is  the  system 
input  in  the  form  of  an  electrical  signal  from  a  source  such  as  a 
TV  camera.  This  signal  is  differenced  with  the  feedback  signal 
through  the  video  mixer.  This  difference  signal  is  displayed  as 
a  CRT  image  which  is  imaged  by  a  vidicon  or  other  camera  with  a 
spatial  filter  corresponding  to  the  point-spread-function  g(x,y). 
The spatial filter is realized by an apodization coating and concurrent
defocus of the lens [3]. This obviously restricts the
transfer function G(u,v) to be one whose impulse response is everywhere
positive.  However,  incoherent  image  "blur"  functions  have  this 
characteristic,  so  a  large  variety  of  realistic  image  processes 
can be implemented. The feedback portion of the processor is obtained
through another CRT, lens with apodizing function, and TV
camera combination as shown. The amplifiers with gains $\alpha$, $\beta$, and
$\gamma$ are used to represent the gains of various loop elements.

Some  important  characteristics  of  Figure  2  can  be  summarized 
in  the  following  points: 

(1)  All feedback effects operate with the video
(temporal, electronic) signal, hence the
problems of wavelength and phase references
are easily dealt with. For example, using
a conventional 30 frames per second, 525
lines-per-frame TV system and assuming temporal
bandwidth along a scan-line equivalent
to 500 plus pixels, the nominal bandwidth of
the video mixing process would be 7.5 MHz.
Allowing sufficient additional bandwidth for
horizontal and vertical synch signals, blanking
pulse, D.C. restoration level, etc.,
gives a 10 MHz bandwidth, for which the wavelength
is 30 meters. Judicious choice of
laboratory set-up should keep the cabling in
the loop of Figure 2 to one meter or less;
hence, any wavelength and phase problems associated
with the feedback shown in Figure 2
should be negligible.

(2)  The temporal data in the video signals are
a means to encode x and y data. Since the
loop filters are spatial filters the result
of the combination of the temporal signals
and the spatial filters is an overall spatial
filter with an input-output response given
by Equation (1), where G(u,v) and H(u,v)
are the two-dimensional frequency responses
of the point-spread-functions encoded by the
apodizing transparencies.

(3)  The system shown in Figure 2 takes a video
signal as an input and yields a video signal
as output. Hence, the system could, at least
conceptually, be used to process video images
in a real time environment. Indeed Lohmann
has described this sort of processor as an
iterative processor running at the TV frame
rate [2].

(4)  The ultimate in flexibility and interaction
would be reached by using a programmable light
modulator for the apodizing transparency.
Combined with a servo on the focus control of
the vidicon, a great variety of optical transfer
functions could be rapidly inserted.
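The figures quoted in point (1) can be checked with a few lines of arithmetic. The sketch below uses only the frame rate, line count, pixel count, and bandwidth stated there; the variable names are illustrative, and the wavelength is computed for free-space propagation, which is an assumption (the velocity in a real cable is somewhat lower).

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, free-space value (assumption)

frames_per_second = 30
lines_per_frame = 525
pixels_per_line = 500

# Pixels delivered per second by the raster scan.
pixel_rate = frames_per_second * lines_per_frame * pixels_per_line
print(f"pixel rate: {pixel_rate / 1e6:.3f} million pixels per second")

# Wavelength of the 10 MHz video bandwidth quoted in the text.
bandwidth_hz = 10e6
wavelength_m = SPEED_OF_LIGHT / bandwidth_hz
print(f"wavelength at 10 MHz: {wavelength_m:.1f} m")

# Fraction of a wavelength spanned by one meter of loop cabling.
cable_length_m = 1.0
print(f"cable length / wavelength: {cable_length_m / wavelength_m:.3f}")
```

The pixel rate is of the same order as the nominal 7.5 MHz figure, and one meter of cabling is a few percent of the 30 meter wavelength, which is the basis for treating phase effects in the electronic loop as negligible.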

The  concept  of  this  feedback  processor  as  an  iterative  algorithm 
is intriguing in that it stimulates the question of the interaction
between the temporal nature of the x and y data and raster
scanning of the x and y data. Furthermore, if one envisions the
processor  of  Figure  2  as  a  parallel  processor  in  the  x  and  y  di¬ 
rections  then  it  effectively  consists  of  a  large  number  (depend¬ 
ent  upon  the  spatial  resolution  of  the  TV  components)  of  temporal 
feedback  loops  in  which  time  stability  is  a  question.  In  this 
light  we  must  devote  some  research  effort  into  the  temporal  as 
well  as  the  spatial  stability  of  the  system  of  Figure  2.  In 
reality  r(x,y)  is  a  function  of  time,  r(x,y,t),  along  with  c(x,y,t), 
and  h(x,y,t).  Time  is  a  variable  in  g(x,y)  and  h(x,y)  simply 
because  of  the  finite  response  times  of  the  video  signals.  In 
addition  t  is  included  as  a  variable  in  g  and  h  to  allow  the  flex¬ 
ibility  of  tailoring  the  temporal  response  of  the  loop  to  insure 
stable  operation.  Inclusion  of  the  loop  temporal  characteristics 
produces  a  relation  between  the  output  and  input  in  the  transform 
domain  as  follows: 


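$$C(u,v,s) = \frac{\gamma\alpha\, G(u,v,s)}{1 + \alpha\beta\, G(u,v,s)\, H(u,v,s)}\, R(u,v,s) \tag{5}$$

where

$$G(u,v,s) = \int_{0}^{\infty}\!\!\iint_{-\infty}^{\infty} g(x,y,t)\, e^{-j2\pi(ux+vy)}\, e^{-st}\, dx\, dy\, dt \tag{6}$$

$$H(u,v,s) = \int_{0}^{\infty}\!\!\iint_{-\infty}^{\infty} h(x,y,t)\, e^{-j2\pi(ux+vy)}\, e^{-st}\, dx\, dy\, dt \tag{7}$$

and

$$R(u,v,s) = \int_{0}^{\infty}\!\!\iint_{-\infty}^{\infty} r(x,y,t)\, e^{-j2\pi(ux+vy)}\, e^{-st}\, dx\, dy\, dt \tag{8}$$
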


We  recognize  these  expressions  as  two-dimensional  Fourier  trans¬ 
forms  in  the  x  and  y  directions  with  a  Laplace  transform  along  t. 

If we can make the assumption that h(x,y,t) has a temporal response
much faster than f(x,y,t), h(x,y,t) can be assumed to be
independent of t so that H(u,v,s) = H(u,v). Furthermore, if the
TV camera in the forward loop can be chosen so as to have a temporal
response (i.e., a finite rise time characteristic of some
sort) independent of the spatial coordinates x and y, then
$G(u,v,s) = G(u,v)\, G_1(s)$. Here $G_1(s)$ is the Laplace transform of the
temporal characteristic of the forward path TV camera. The input-
output relationship of Equation (1) becomes

$$C(u,v,s) = \frac{\gamma\alpha\, G(u,v)\, G_1(s)}{1 + \alpha\beta\, G_1(s)\, G(u,v)\, H(u,v)}\, R(u,v,s) \tag{9}$$


An interesting research question is the effect of a particular
$G_1(s)$ upon the feedback processor's performance. Consider r(x,y,t)
= r(x,y) for t > 0 so that R(u,v,s) = R(u,v)/s. In addition, if we
assume that $G_1(s) = 1/(s+1)$, which would represent a simple low pass
filter characteristic in the forward camera, we obtain

$$C(u,v,s) = \frac{\gamma\alpha\, G(u,v)}{s + 1 + \alpha\beta\, G(u,v)\, H(u,v)} \cdot \frac{1}{s}\, R(u,v) \tag{10}$$


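Application of the final value theorem for Laplace transforms produces

$$\lim_{t \to \infty} C(u,v,t) = \frac{\gamma\alpha\, G(u,v)}{1 + \alpha\beta\, G(u,v)\, H(u,v)}\, R(u,v) \tag{11}$$

where

$$C(u,v,t) = \iint_{-\infty}^{\infty} c(x,y,t)\, e^{-j2\pi(ux+vy)}\, dx\, dy \tag{12}$$
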


It  is  clearly  seen  that  this  is  the  same  relationship  as  we  had 
before  in  Equation  (1).  Because  most  cameras  have  a  finite  time 
lag  between  the  image-induced  charge  being  stored  and  the  video 
signal  output,  some  shift  in  the  x  direction  is  anticipated.  This 
again raises the question of the separability of the raster scanning
and finite time response of the TV camera. There is, however,
an extremely interesting way of circumventing this problem that
deserves some attention. TV camera tubes such as the vidicon or
the plumbicon can be used as image storage tubes. This is accomplished
by focusing an image on the photocathode while the scan
is inhibited. The image-induced charge will be stored (for a time
dependent on the tube dark current) until the scan is initiated,
whereupon the image will be destructively read out as a video
signal.  By  suitably  blanking  the  CRT's  and  inhibiting  the  scan 
of  the  TV  cameras  in  the  system  of  Figure  2,  we  can  achieve  its 
sampled  data  counterpart  as  shown  in  Figure  3. 
Operation of the feedback processor can be described as follows:
closure of S1 unblanks the forward path CRT and initiates
the scan* of the feedback path TV camera. S2 is open at this time,
which blanks the feedback CRT and inhibits the forward path TV
camera scan. After completion of the feedback TV scan, the spatially
filtered error signal is stored on the forward-path TV
camera. S1 is opened and S2 is closed, unblanking the feedback
CRT (and initiating the forward TV camera scan). Upon completion
of this scan, S2 is opened and the spatially filtered output signal
is stored on the feedback TV camera. This process is then
repeated.

It  is  relatively  straightforward  to  show  that  this  scheme 
produces  a  sampled  data  feedback  processor  whose  block  diagram 
is  given  in  Figure  4.  The  input-output  relationship  using  a  Z 
instead  of  a  Laplace  transform  is  as  follows: 

$$C(u,v,z) = \frac{\gamma\alpha\, z^{-1}\, G(u,v)}{1 + \alpha\beta\, z^{-1}\, G(u,v)\, H(u,v)}\, R(u,v,z) \tag{13}$$

This produces the following recursion

$$C(u,v,nT) = G(u,v)\left\{ \gamma\alpha\, R(u,v,(n-1)T) - \alpha\beta\, H(u,v)\, C(u,v,(n-1)T) \right\} \tag{14}$$

where

$$C(u,v,nT) = \iint_{-\infty}^{\infty} c(x,y,nT)\, e^{-j2\pi(ux+vy)}\, dx\, dy \tag{15}$$

$$R(u,v,nT) = \iint_{-\infty}^{\infty} r(x,y,nT)\, e^{-j2\pi(ux+vy)}\, dx\, dy \tag{16}$$

*Scan means single frame scan since a single scan removes almost
all of the stored charge in a plumbicon.


Again, assuming that r(x,y,t) = r(x,y) for t > 0 we have
$R(u,v,z) = R(u,v)/(1 - z^{-1})$, and application of the final value
theorem for z transforms gives that

$$C(u,v,\infty) = \frac{\gamma\alpha\, G(u,v)}{1 + \alpha\beta\, G(u,v)\, H(u,v)}\, R(u,v) \tag{17}$$

as before.
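The frame-by-frame behavior of the sampled-data loop can be illustrated by iterating the recursion at a single spatial frequency and comparing the result with the final value. This is a minimal sketch: the function name `iterate_loop`, the zero initial stored output, and the sample values of G, H, and R are assumptions for illustration.

```python
def iterate_loop(G, H, R, alpha, beta, gamma, n_frames):
    """Iterate C[n] = G * (gamma*alpha*R - alpha*beta*H*C[n-1]) for a
    constant input R at one spatial frequency, starting from C = 0."""
    C = 0j
    history = []
    for _ in range(n_frames):
        C = G * (gamma * alpha * R - alpha * beta * H * C)
        history.append(C)
    return history


# Hypothetical filter values at one frequency (u, v).
G = 0.8 + 0.1j
H = 0.6 - 0.2j
R = 1.0 + 0.0j
alpha, beta, gamma = 1.0, 1.0, 1.0

history = iterate_loop(G, H, R, alpha, beta, gamma, n_frames=60)
steady = gamma * alpha * G / (1.0 + alpha * beta * G * H) * R

print("loop gain magnitude:", abs(alpha * beta * G * H))
print("after 60 frames    :", history[-1])
print("final value        :", steady)
```

Each pass multiplies the remaining error by the loop gain, so at a given frequency this iteration settles only when the magnitude of the product of the loop gains with G(u,v)H(u,v) is less than one; for the values chosen here it is about 0.51.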


Thus  we  see  that  this  formulation  also  produces  the  desired 
result.  However,  the  optical  readin  and  electrical  readout  func¬ 
tions  in  the  TV  cameras  have  been  separated  by  the  sampling  func¬ 
tions  simply  by  the  fact  that  the  image  induced  charge  is  allowed 
to  accumulate  prior  to  the  initiation  of  the  scan.  This  effec¬ 
tively  eliminates  the  temporal  characteristics  of  the  x  and  y  data 
due  to  raster  scanning. 

The incoherent optical structures we are discussing
can be readily applied in a variety of ways. The applicability
to image filtering is direct from the equations derived
above. The applicability to DPCM is equally direct. A DPCM
structure such as seen in Figure 5 would be readily implemented
using  the  frame-storage  techniques  discussed  in  conjunction  with 
Equations  (13)-(17)  above.  Thus,  N  parallel  DPCM  loops  could  be 
created  by  the  optical/video  hybrid  components  at  a  reasonable 
cost. 
Abstract


A model describing the decomposition of imagery in the human retina is developed based
on the retina's cellular structure. Two types of retinal cells, horizontals and amacrines,
perform spatial averaging across the retina to form a low-pass image channel. This low
spatial frequency information is fed back to the retina's receptor cells to form a difference
channel of high-passed spatial frequencies. Such a model is suggested by electrophysiological
as well as psychophysical evidence. Analysis of the model characterizes the
low-pass channel as a contrast channel and the difference channel as an edge detection
channel. Application of the model to image quality assessment suggests a two factor
approach involving metrics in the model's eye domain.

Introduction 


Because of its great importance to our perception of the outside world, human vision has
been seriously studied for centuries and is being studied today by many scientists in a
variety of disciplines. Each line of vision research is aided by the dictates of its
discipline, but in the process becomes limited by them as well. An experimental psychologist,
for instance, studies vision by presenting light stimuli and recording the observers'
responses, but he is then limited to input/output descriptions of visual behavior. The
anatomist details the structures which comprise our visual system, but traditionally does
not study their function. Function is studied by the physiologist, but resulting descriptions
are limited by the enormous complexities involved. The need then exists for a discipline
which can handle these complexities and provide theoretical descriptions to describe
visual behavior on a more global level.

In its attempt to provide unifying descriptions of the complex machines man builds,
engineering also provides theoretical tools to describe nature's complexities. Engineering
models which aid in the design of image processing systems can be used to describe the
human visual system as well. Analogies can then be made and components of our visual
system identified which do image forming, sensing, coding, and transmitting, much like
their physical counterparts. A good example of this is the vision model by Charles Hall
and Ernest Hall1 as shown in Figure 1. Each component in the model corresponds to a component
or a process actually found in the eye: the low-pass filter models the ability of
the eye's optics to form an image on the retina, the brightness function models the point-
by-point transformation from light intensities to their neural representation, and the high-
pass filter models the spatial interactions resulting from the cell connections in the
retina. Such structural validity is a strength of the Hall and Hall model, but its retina
model falls short in this regard. A spatially continuous model is valid for the low-pass
filter since a continuous radiometric image is formed upon the retina, but the high-pass
filter models the spatially discrete image processing which occurs in the retina's cells.
Moreover, the retinal cells are fixed in their functional relationships with one another,
and this structure imposes constraints on the image processing performed there.


Figure  1.  The  Hall  and  Hall  Model  of  Human  Vision. 

Cell Structure in the Human Retina

The retina is a network of nerve cells which are "hard wired" according to a particular
organizing scheme. Although some of the connection parameters may differ in different
areas of the retina and for other vertebrate species, the same basic scheme pervades the
visual organization of the retina4. Figure 2 is a schematic representation of this organizing
theme. The retina first senses the image with a layer of receptor cells. Two types
of receptors, rods and cones, and three types of cones are found. Cones form our high
intensity or photopic system of color vision, while rods form our night vision or scotopic
system which is achromatic. Just beyond the receptors is a thin layer of neurons called
horizontal cells which connect receptor cells together from neighboring regions of the
retina. A typical horizontal cell near the center of the retina (in the fovea) receives
signals from seven cones; a foveal cone for its part feeds between two and four horizontal
cells. The horizontal cell then sends its signal out via a small number (approximately 4)
of long, thin arms called axons to connect with an unknown number of receptor cells in
neighboring regions. No horizontal-to-horizontal cell contacts have been documented in man3
but they have been found in cat and rabbit retinas. The horizontal cell network is, therefore,
probably not a continuous one in man, but considerable overlapping of its branches is
in evidence.


Figure 2. Schematic Diagram of Retinal Cell Connections. (Diagram labels: Receptor Cells,
Horizontal Cells, Bipolar Cells, Amacrine Cells, Ganglion Cells, Optic Nerve.)

Forming the central layer of the retina are the bipolar cells. With single input and
output arms of equal length, the bipolars connect the outer with the inner retinal layers,
running parallel to the direction of light. Two types of bipolars connect with the cones
in the fovea: midget and flat bipolar cells. A midget bipolar normally connects to one
cone, while a flat bipolar connects to about seven cones. Each cone in the fovea connects
to at least one midget bipolar and one or more flat bipolars. The midget bipolars thus
appear to carry high-acuity image data and are likely color-coded according to their
attached cones; the flat bipolars carry small-area averages which are probably monochromatic
in nature.

The inner retinal layer is composed of amacrine and ganglion cells connecting to the
output arms of the bipolar cells. Like horizontal cells, amacrines connect laterally
across the retina, but unlike horizontals they form a functionally continuous network. The
amacrines connect with bipolar axons in a way which allows for feedback to a bipolar from
neighboring bipolars; amacrines also connect with one another and with ganglion cells.
Approximately $10^6$ ganglion cells form the final retinal layer; their very long axons bundle
together to form the optic nerve, which contains all of the image information to be used by
the brain.

The lateral connections of the horizontal and amacrine cells provide the basis for a two-
channel decomposition of images. The horizontal cell network performs some degree of spatial
averaging on the image at the first level of neural processing, thus forming a low
spatial frequency version of the image. This information is fed back to bias the receptors,
which makes them respond relative to a local brightness average. The midget bipolar cells
then transmit this high spatial frequency information with the local averages removed, while
the flat bipolars transmit some form of the low spatial frequency information. In a similar
manner the amacrine cell network provides for bipolar-to-bipolar feedback, and additional
spatial processing may take place there. Amacrines appear quite complex in their function,
providing for spatial-temporal interactions and for image color encoding3. The two-channel
organization of imagery is in evidence at the bipolar cell level, however, before the
intervention of the amacrines.

Two spatial channels are thus transmitted by the ganglion cells to higher areas in the
brain. Two types of ganglion cells have, in fact, been observed: sustained and transient
ganglion cells6. The sustained ganglions respond in a temporally steady manner to a constant
stimulus and appear to be somewhat more numerous in the fovea than in the periphery
of the retina. Transient ganglions, on the other hand, respond primarily to stimulus
changes and exhibit decaying responses to steady state stimuli. As far as spatial frequencies
go, the transient ganglions have been experimentally associated with low spatial
frequencies and the sustained ganglions with middle to high spatial frequencies. Thus, the
two channels which are formed by horizontal cell feedback to the receptors, and are in evidence
in the bipolar cells, are separately transmitted by the ganglion cells to higher
visual centers in the brain.

Some recent findings point to a two-channel organization of imagery in the brain's higher
levels as well. In recordings from the foveal striate cortex of rhesus monkeys, Poggio,
Doty, and Talbot report two types of spatial frequency responses: band-pass and low-pass.
They observe that low-frequency grating stimuli activate only particular low-pass neurons,
depending on whether a contrast border falls within their receptive field. A band-pass
neuron, on the other hand, requires a number of repeated edges (as in a grating) at an
appropriate spatial frequency before it will respond, and at high frequencies only band-pass
neurons are activated. At intermediate spatial frequencies of around 4 cycles/degree,
significant numbers of both types of neurons are activated. All of the neurons they observed
fall into one of these two categories, indicating a pervasive two-channel organization.


A Two-Channel Model of Spatial Interaction in the Human Retina


A two-channel model of retinal image encoding will now be expressed mathematically. The
model will be two-dimensional, discrete in space, and achromatic. (Figure 3 gives a block
diagram.) A discrete version of the continuous light intensity image which is optically
formed on the retina constitutes the input image and is symbolized by the non-negative variable
$u_{ij}$. The intensity-to-brightness mapping occurs first in the model and is performed
in a point-by-point manner by:

$$x_{ij} = \log(1 + u_{ij}) \tag{1}$$

$x_{ij}$ represents the brightness image in the receptor cells. Letting $y_{ij}^{(1)}$ and $y_{ij}^{(2)}$
be the Channel 1 and Channel 2 output images in the optic nerve, the remainder of
the model can be expressed in two discrete equations as follows:

$$y_{ij}^{(1)} = S\left\{ \kappa_1 \sum_{k} \sum_{l} h_{kl}^{(1)}\, x_{i-k,\,j-l} \right\} \tag{2}$$

$$y_{ij}^{(2)} = \Gamma\left\{ \kappa_2 \left[ \sum_{m} \sum_{n} h_{mn}^{(2)}\, x_{i-m,\,j-n} - \sum_{k} \sum_{l} h_{kl}^{(1)}\, x_{i-k,\,j-l} \right] \right\} \tag{3}$$


There are two unit sample responses or weighting functions indicated here: $h^{(1)}$ is a broad
weighting function which models the relatively large inhibitory regions created by the
horizontal cells, while $h^{(2)}$ is a narrow weighting function which models the smaller excitatory
regions. Both weighting functions are Gaussian in form and are radially symmetric
about the origin; more will be said about this choice later. The function $S(\cdot)$ is a spatial
sampling of the low-frequency image in Channel 1 of the model. This sampling is intended to
model the likelihood that there are fewer connections for the low frequencies than for the
high frequencies. The second channel of the model, Equation (3), is formed by the subtraction
of the low-frequency image of Channel 1 from a high-frequency representation of the
image. In modelling the retina at the fovea the high-frequency weighting function, $h^{(2)}$,
would have essentially a one bipolar cell extent, and Channel 2 would reduce to:


$$y_{ij}^{(2)} = \Gamma\left\{ \kappa_2 \left[ x_{ij} - \sum_{k} \sum_{l} h_{kl}^{(1)}\, x_{i-k,\,j-l} \right] \right\} \tag{4}$$

A second nonlinearity is indicated in Channel 2 of the model by $\Gamma(\cdot)$; this is a saturation
operator which models the bipolar cell's response limitations to receptor input. Finally,
$\kappa_1$ and $\kappa_2$ are constants which model different channel gains.

Equations (1), (2), and (3) constitute a two-dimensional discrete model of neural image
formation in the retina. The model has two nonlinearities, namely $\log(1 + u_{ij})$ and $\Gamma(\cdot)$.
Because of this, it is important that the model be structurally valid, i.e., the nonlinearities
must come at the correct places in the model. With linear, shift-invariant models
structural validity is unimportant; the entire model could, in fact, be represented by a
single transfer function without altering in any way the model's input/output behavior.
Changing the location of a nonlinearity, however, changes the input/output behavior, so it
must be placed carefully for the proper overall model behavior. Since the two-channel model
has been built with retinal structure in mind, it has been possible to incorporate the
nonlinearities in their proper locations.


Figure 3. A Two-Channel Model of Spatial Interaction in the Human Retina.


The idea of using weighting functions to describe retinal behavior via spatial convolutions
is not a new one. There has not been agreement, however, on the form the weighting
function should take. A discussion comparing different types of weighting functions can be
found in Macleod and Rosenfeld. They prefer Gaussian type weighting functions and in
particular a dual-Gaussian function of the form:

$$w(x) = \exp\left[-(2x/s)^2\right] - (s/s')\, \exp\left[-(2x/s')^2\right] \tag{5}$$

The widths of the excitatory and the inhibitory regions can be adjusted separately here with
the parameters s and s', but the positive and negative areas under the curve are kept equal
to one another. Use of single Gaussian weighting functions for $h^{(1)}$ and $h^{(2)}$ in the two-
channel model will result in a weighting function in the form of Equation (5) for the high-
frequency channel of the model (Channel 2). In two dimensions the weighting functions will
be radially symmetric (this is an approximation since the retina is somewhat anisotropic)
and can then be expressed as:

    h_{1,2}(x, y) = \exp[-(x^2 + y^2)/p_{1,2}^2]    (6)

where the spread parameter, p_1 or p_2, uniquely specifies the function. Choice of the proper
values for the model's parameters has been accomplished by computer simulation and comparison
of the simulation results to experimental data. For foveal and near-foveal regions,
comparable frequency response is obtained with p_1 = 10 and p_2 = 0.5, and a comparable
reconstructed image is obtained with K_1 = 1, K_2 = 3, a Channel 1 sampling ratio of 16:1, and
Channel 2 saturation levels of ± 1/6 of the maximum image range in brightness. Details of
this work can be found in the author's dissertation.
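The equal-area property claimed for the dual-Gaussian function of Equation (5) can be checked numerically. The following is a minimal sketch, not part of the original work; the widths s = 1 and s' = 4, the integration limits, and the function names are illustrative assumptions.

```python
import math

def dual_gaussian(x, s, s_prime):
    """Equation (5): narrow excitatory Gaussian minus a scaled, wider inhibitory Gaussian."""
    excite = math.exp(-(2.0 * x / s) ** 2)
    inhibit = (s / s_prime) * math.exp(-(2.0 * x / s_prime) ** 2)
    return excite - inhibit

def net_area(s, s_prime, half_width=50.0, steps=20000):
    """Midpoint-rule integral of w(x) over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += dual_gaussian(x, s, s_prime)
    return total * dx

# Illustrative (assumed) widths: s = 1 for excitation, s' = 4 for inhibition.
area = net_area(s=1.0, s_prime=4.0)        # positive and negative lobes should cancel
center = dual_gaussian(0.0, 1.0, 4.0)      # peak value, 1 - s/s'
```

Because each Gaussian integrates to a value proportional to its width, the factor s/s' scales the wide inhibitory lobe so that its area matches the narrow excitatory one, and the net area is zero for any choice of s and s'.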

Choice of the logarithm for the intensity-to-brightness nonlinearity is in direct correspondence
with Hall and Hall's model. Others have used the cube-root function, but this
choice is not directly applicable here due to differences in model structure. Both choices
for the brightness function yield sufficient dynamic range compression of intensity values,
but the logarithm and the cube-root functions possess different non-linear characteristics
which provide the basis for a choice. Two non-linear characteristics of the logarithm
function will be explored in the following section on model analysis.

Analysis  of  the  Two-Channel  Model 

The logarithmic brightness transform at the beginning of the two-channel model alters
model behavior in a non-linear manner. In addition to global compression of the input data,
the logarithm produces two effects of significance. The first of these is an automatic
gain control of small amplitude signals, which enhances contrast at low light levels. The
second effect occurs in conjunction with a multiplicative model of image formation and leads
to a characterization of the model's two output channels. Mathematical treatment of both
effects will be given in one continuous dimension.


Consider a small sinusoidal modulation of amplitude A at a relatively large constant
intensity level, I, as the input:

    u(x) = I + A \cos \omega x.    (7)

If contrast is defined by:

    C = \frac{u_{max} - u_{min}}{u_{max} + u_{min}}    (8)

then the contrast of this input is simply:

    C_{in} = \frac{A}{I}.    (9)
After being log transformed, the input signal becomes:

    f(x) = \log_b(1 + I + A \cos \omega x),    (10)

which can be expanded in a Taylor series about (I + 1) as follows:

    f(x) = \frac{1}{\ln b} [ \ln(1 + I) + \frac{A \cos \omega x}{1 + I} - \frac{A^2 \cos^2 \omega x}{2(1 + I)^2} + ... ]    (11)


Since I is assumed large and A small, the first two terms constitute a good approximation
to the output waveform:

    f(x) \approx \frac{1}{\ln b} [ \ln(1 + I) + \frac{A \cos \omega x}{1 + I} ]    (12)

The contrast derived from this approximation, now, is:

    C_{out} = \frac{A}{(1 + I) \ln(1 + I)} \approx \frac{C_{in}}{\ln(1 + I)}    (13)


a result which is independent of the original log transform's base, b. In comparison to the
input contrast, the output contrast has been approximately reduced (with I large) by a
factor of (ln I). This means that the small-amplitude modulation of A_1 at an average
intensity level of I_1 must be larger by a factor of ln(I_1 - I_0) in order to yield the same
output contrast as a modulation of A_0 at a lower average intensity level, I_0, after both
have been log transformed. In terms of input parameters, this condition can be expressed

    C_1 = \ln(I_1 - I_0) C_0    (14)

which relates input contrasts that are equivalent after log transformation.
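The contrast reduction of Equation (13) can be checked by direct evaluation. The sketch below is an illustration only, assuming the modulation definition of contrast from Equation (8); the values I = 1000 and A = 10 and all function names are assumptions chosen for the example.

```python
import math

def modulation_contrast(samples):
    """Contrast as (max - min) / (max + min), the definition assumed for Equation (8)."""
    hi, lo = max(samples), min(samples)
    return (hi - lo) / (hi + lo)

def log_output_contrast(I, A, base=10.0, n=1000):
    """Contrast of f(x) = log_b(1 + I + A cos x), sampled over one full period."""
    xs = [2.0 * math.pi * k / n for k in range(n)]
    f = [math.log(1.0 + I + A * math.cos(x), base) for x in xs]
    return modulation_contrast(f)

I, A = 1000.0, 10.0                               # assumed: I large, A small
c_in = A / I                                      # Equation (9)
c_out_10 = log_output_contrast(I, A, base=10.0)   # exact contrast, base 10
c_out_e = log_output_contrast(I, A, base=math.e)  # exact contrast, natural log
c_pred = A / ((1.0 + I) * math.log(1.0 + I))      # two-term prediction, Equation (13)
```

For these values the exact and predicted output contrasts agree to well under one percent, the base of the logarithm cancels in the ratio, and the output contrast is lower than the input contrast by roughly a factor of ln(1 + I).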


A logarithmic intensity/brightness transform provides global compression of the input
image's dynamic range and boosts small signals at low average intensities relative to those
at high average intensities. When acting in conjunction with the model's two-channel linear
filter, the log transform has another interesting feature, namely the separation of object
contrast from scene contrast. Analytical treatment of this effect requires a specific
model of scene formation, but results of the analysis characterize Channel 1 of the model as
a contrast channel and Channel 2 as an edge detection channel.


A basic assumption must first be made concerning the formation of a scene's radiance
pattern. That is, the radiance from a scene is the product of two components, an illumination
component and a reflectance component. One-dimensionally this can be expressed by:

    u(x) = u_r(x) \cdot u_i(x),    (15)

with the constraints:

    0 < u_i < \infty    (16)

and:

    0 < u_{r,min} \le u_r \le u_{r,max} < 1.    (17)


Stockham¹ has argued for the validity of this model, and while setting u_{r,min} equal to
.005, points out that .01 is a likely minimum in virtually all situations. The important
information in a scene is that which tells us something about the objects in the scene; this
information is contained in the reflectance component of the radiation, u_r. Varying illumination
across the scene combines with the object information in a multiplicative manner,
making object detection difficult for a linear detector of radiation. An ideal object detector
must somehow separate the reflectance from the illumination component in order to
"see" the objects in a scene.


The two channels produced by the retina are able to separate the reflectance and illumination
components if it can be additionally assumed that the illumination component is of
lower spatial frequency content than the reflectance component. An analytical example will
suffice to show this. Let

    u_r(x) = .5 + (.49) m_r \cos \omega_r x ;  0 < m_r < 1,    (18)

be the reflectance of an object of spatial frequency ω_r and modulation m_r. The minimum
reflectance possible here is .01 and the maximum is .99. Using the modulation definition of
contrast given in Equation (8), the contrast of u_r is:

    C_r = .98 m_r.    (19)

Now let

    u_i(x) = I + A \cos \omega_i x ,  I >> A ,    (20)

be the scene illumination with a DC level of I and an amplitude of A; the contrast of u_i is:

    C_i = \frac{A}{I}.    (21)

The following signal is thus taken as the input to the two-channel retina model:

    u(x) = [.5 + (.49) m_r \cos \omega_r x] \cdot [I + A \cos \omega_i x].    (22)

Let f(x) be the log transformed signal:

    f(x) = \log_{10}(1 + u(x)),    (23)

which can be expanded in a Taylor series about u(x):

    f(x) = \log_{10}[u(x)] + \frac{1}{\ln 10} [ \frac{1}{u(x)} - \frac{1}{2 u^2(x)} + \frac{1}{3 u^3(x)} - ... ]

Since the photopic region of vision is being modelled, u(x) is large and the following
approximation can be made:

    f(x) \approx \hat{f}(x) = \log_{10}[u(x)]

Substituting in for u(x) gives,

    \hat{f}(x) = \log_{10}[u_r(x) \cdot u_i(x)],

    \hat{f}(x) = \log_{10}[u_r(x)] + \log_{10}[u_i(x)].    (27)

With the input signal of Equation (22) this becomes:

    \hat{f}(x) = \log_{10}[.5 + (.49) m_r \cos \omega_r x] + \log_{10}[I + A \cos \omega_i x].    (28)

This signal next enters the linear portion of the model and produces a Channel 1 output,
y_1(x), and a Channel 2 output, y_2(x). In order to find expressions for these two outputs
in terms of their intervening transfer functions, the input, \hat{f}(x), must be expressed as a
sum of pure sinusoidal components. The second term of (28) can readily be approximated by
the first two terms of its Taylor's expansion since I is much larger than A in the photopic
region of vision. This gives:

    \log_{10}[I + A \cos \omega_i x] \approx \log_{10}(I) + \frac{A \cos \omega_i x}{I \ln 10}.    (29)

More terms must be taken for an equivalent Taylor's approximation of the first term of (28):

    \log_{10}[.5 + (.49) m_r \cos \omega_r x] \approx \log_{10}(.5) + (.426) m_r \cos \omega_r x
        - (.209) m_r^2 \cos^2 \omega_r x + (.068) m_r^3 \cos^3 \omega_r x - (.017) m_r^4 \cos^4 \omega_r x    (30)

Trigonometric substitutions are made to eliminate the powered cosine functions, yielding:

    \log_{10}[.5 + (.49) m_r \cos \omega_r x] \approx (-.301 - .104 m_r^2 - .006 m_r^4) + (.426 m_r + .051 m_r^3) \cos \omega_r x
        - (.104 m_r^2 + .008 m_r^4) \cos 2\omega_r x + .017 m_r^3 \cos 3\omega_r x - .002 m_r^4 \cos 4\omega_r x    (31)

The five terms here include a DC term, a fundamental frequency term, and its first three
harmonics, with each term a function of the reflectance modulation, m_r. With the addition
of the two terms of Equation (29), a completed approximation of \hat{f}(x) in terms of distinct
spectral components is finally given by:

    \hat{f}(x) \approx [\log_{10}(I) - .301 - .104 m_r^2 - .006 m_r^4] + \frac{A}{I \ln 10} \cos \omega_i x + (.426 m_r + .051 m_r^3) \cos \omega_r x
        - [.104 m_r^2 + .008 m_r^4] \cos 2\omega_r x + .017 m_r^3 \cos 3\omega_r x - .002 m_r^4 \cos 4\omega_r x    (32)
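The trigonometric substitutions that take the powered cosines of Equation (30) to the harmonic form of Equation (31) are the standard power-reduction identities. The sketch below applies them to the power-series coefficients exactly as printed in Equation (30); it only checks the bookkeeping of that step and makes no claim about the accuracy of the truncated series itself. The function names are illustrative assumptions.

```python
import math

# Power-series coefficients as printed in Equation (30)
C1, C2, C3, C4 = 0.426, 0.209, 0.068, 0.017

def harmonic_amplitudes(m):
    """Return (dc, a1, a2, a3, a4): the series of Eq. (30), without log10(.5),
    rewritten as dc + a1 cos t + a2 cos 2t + a3 cos 3t + a4 cos 4t."""
    p1, p2, p3, p4 = C1 * m, -C2 * m ** 2, C3 * m ** 3, -C4 * m ** 4
    # cos^2 t = (1 + cos 2t) / 2
    # cos^3 t = (3 cos t + cos 3t) / 4
    # cos^4 t = (3 + 4 cos 2t + cos 4t) / 8
    dc = p2 / 2 + 3 * p4 / 8
    a1 = p1 + 3 * p3 / 4
    a2 = p2 / 2 + p4 / 2
    a3 = p3 / 4
    a4 = p4 / 8
    return dc, a1, a2, a3, a4

def power_series(m, t):
    """Left-hand form: polynomial in cos t, as in Equation (30)."""
    c = math.cos(t)
    return C1 * m * c - C2 * (m * c) ** 2 + C3 * (m * c) ** 3 - C4 * (m * c) ** 4

def harmonic_series(m, t):
    """Right-hand form: sum of harmonics, as in Equation (31)."""
    dc, a1, a2, a3, a4 = harmonic_amplitudes(m)
    return (dc + a1 * math.cos(t) + a2 * math.cos(2 * t)
            + a3 * math.cos(3 * t) + a4 * math.cos(4 * t))

dc, a1, a2, a3, a4 = harmonic_amplitudes(1.0)
```

With m_r = 1 the computed amplitudes agree with the three-decimal coefficients of Equation (31) to within rounding, and the two forms are equal at any angle.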


The log transform thus reduces the global intensity of the image (first term in (32)),
attenuates the illumination component (second term) by (I \ln 10)^{-1}, and changes the
reflectance component from a multiplicative to an additive component (third term), while
also introducing harmonics (last three terms).

The approximated brightness signal, \hat{f}(x), is next input to the eye model's linear filter
which produces a low-pass output, y_1(x), and a high- or band-pass output, y_2(x). Let G_1(ω)
and G_2(ω) be the modulation transfer functions of Channels 1 and 2, respectively. If it is
assumed that G_1(ω) cuts off by ω_r with increasing ω and that G_2(ω) cuts off by ω_i with
decreasing ω, then the outputs can be expressed by:

    y_1(x) = G_1(0) [\log_{10}(I) - .301 - .104 m_r^2 - .006 m_r^4] + G_1(\omega_i) \frac{A}{I \ln 10} \cos \omega_i x    (33)

and,

    y_2(x) = G_2(\omega_r) [.426 m_r + .051 m_r^3] \cos \omega_r x - G_2(2\omega_r) [.104 m_r^2 + .008 m_r^4] \cos 2\omega_r x
        + G_2(3\omega_r) [.017 m_r^3] \cos 3\omega_r x - G_2(4\omega_r) [.002 m_r^4] \cos 4\omega_r x    (34)

An analytical conclusion can be made from Equations (33) and (34). Namely, the two-channel
retina model acts to separate illumination intensity and contrast information into
one channel and object reflectance information into the second channel. This is a desirable
feature of a detector of objects to have, and it is apparently a design feature of our
retinas.
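The separation described by Equations (33) and (34) can be illustrated numerically. The sketch below is a simplified stand-in, not the model's actual filters: a circular moving average one reflectance period wide plays the role of the low-pass channel G_1, and the residual plays the role of the high-pass channel G_2. The sample count, the frequencies, and the values I = 1000, A = 100, m_r = 0.5 are all assumptions made for the example.

```python
import math

N = 4096
CYCLES_I, CYCLES_R = 1, 64        # assumed illumination and reflectance cycles per record
I, A, m_r = 1000.0, 100.0, 0.5    # assumed photopic-like values, I >> A

x = [2.0 * math.pi * n / N for n in range(N)]
u_r = [0.5 + 0.49 * m_r * math.cos(CYCLES_R * t) for t in x]   # Equation (18)
u_i = [I + A * math.cos(CYCLES_I * t) for t in x]              # Equation (20)
f = [math.log10(1.0 + r * i) for r, i in zip(u_r, u_i)]        # Equation (23)

def circular_moving_average(signal, width):
    """Crude low-pass filter: mean over `width` neighbouring samples, wrapping around."""
    n = len(signal)
    half = width // 2
    return [sum(signal[(k + j - half) % n] for j in range(width)) / width
            for k in range(n)]

def remove_mean(signal):
    mean = sum(signal) / len(signal)
    return [v - mean for v in signal]

# Channel 1 stand-in (low-pass) and Channel 2 stand-in (what the low-pass removes)
y1 = circular_moving_average(f, N // CYCLES_R)
y2 = [a - b for a, b in zip(f, y1)]

log_i = remove_mean([math.log10(v) for v in u_i])
log_r = remove_mean([math.log10(v) for v in u_r])
err_y1 = max(abs(a - b) for a, b in zip(remove_mean(y1), log_i))
err_y2 = max(abs(a - b) for a, b in zip(y2, log_r))
```

Under these assumptions the low-pass output follows the logarithm of the illumination, and the residual follows the logarithm of the reflectance, each to within a small error that comes mainly from the "1 +" inside the logarithm.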


Conclusions:  Implications  of  the  Two-Channel  Model 

The two-channel model describes how and suggests why the retina decomposes an image at
the first levels of neural representation. The low-frequency channel allows the eye to
represent a relatively large luminance range simultaneously (approximately three orders of
magnitude). It also acts to bias the high-frequency channel, allowing the edge (high-acuity)
data to be represented easily within the dynamic range of the bipolar and ganglion
cells. The edge information gets a high priority in neural coding terms; more neural coding
power goes to represent the high-frequency image compared to the low-frequency one. This
implies that loss of image acuity and the edge information associated with it is viewed more
critically by the retina than loss of global image contrast, which is represented mainly by
the low-frequency channel. This has been the case in image quality assessment studies where
acuity is consistently chosen as the primary quality factor with contrast the secondary
factor¹².

An assessment of image quality can be made via the two-channel model. Image quality
would be expressed with two factors: edge quality or degree of acuity and contrast quality.
Experiments could be performed in which an original image is degraded by various combinations
of blur and contrast to yield a set of test images. Quality measures could be computed
for these test images by using a difference metric between their eye domain versions
and an eye domain version of the original. Each test image could then have a pair of
global quality measures for a given setting of the eye model (say, foveal) and a local pair
for any desired subregion of the image. A factor analysis and a step-wise regression
analysis could be performed between these metrics and results from a viewing experiment.
The result would be a multiple regression equation predicting subjective image quality
as a function of computable image metrics.

The two-channel model also implies that edge information is of fundamental importance
to subsequent levels of neural processing in the brain. All of the image information must
pass through the bipolar cell array, at which point the two image channels are already in
evidence. The inner synaptic layer of the retina then receives these signals from the
bipolars as input. This layer thus deals with edge images and contrast images, and any model
at this level (a color coding model, for instance) must assume this. That edge information
is part of the "language" of the retina has been shown experimentally. In a recent Scientific
American article, experiments are described which show that if an edge separating an
inner from an outer circle of two colors is eliminated by stabilizing the image on the
retina then the color difference disappears too. Thus, the edges or borders in a scene are
extracted from the image in the first layer of the retina and are basic to all of our visual
perceptions of the world.
Abstract

This paper presents an image coding algorithm using spline functions that is competitive with the more
conventional orthogonal transform methods at data rates of 1 bit/pixel or less. Spline coding has the added
attraction of an optical implementation arising from the fact that least squares image approximations also
produce least squares approximations to the image derivatives. A first order spline is used to approximate
the proper order derivative of the image whose order is determined by an analysis presented in the paper.
The image derivative is then encoded and transmitted to the user who reconstructs the image by a k-1 order
integration which can be done optically.

Introduction

This paper is concerned with the development of the concepts of image degrees of freedom and entropy from
an approximation theoretic viewpoint for application to image coding. These concepts are used to develop a
coding method using spline functions that can be implemented using optical processing techniques.

Treating the degrees of freedom of an image as an approximation problem arises quite naturally in the
context of image coding by transform methods, where an orthogonal transformation is performed on a sampled
image matrix. A bandwidth reduction is obtained by transmitting only those transform coefficients above a
certain threshold whose level is consistent with the desired error [1].

In this sense, the degrees of freedom of the image at an error of magnitude epsilon, or more succinctly,
the epsilon degrees of freedom in terms of the orthogonal functions used, DOF(ε, φ), is simply the number of
functions in the set {φ} required to achieve the desired error, ε. The overall data rate R(φ, ε) is the
product of the number of bits, N_b, required to adequately represent the coefficients and the number of
coefficients, DOF(ε, φ), and is given by:

    R(\phi, \epsilon) = N_b \cdot DOF(\epsilon, \phi)
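The notion of epsilon degrees of freedom for an orthogonal transform can be sketched in one dimension. The example below is an illustration under stated assumptions, not the paper's procedure: it uses an orthonormal DCT as the function set {φ}, measures error as the fraction of signal energy left out (which Parseval's relation allows for any orthonormal transform), and treats the number of bits per coefficient as a fixed given. All names are hypothetical.

```python
import math

def dct_orthonormal(signal):
    """Orthonormal DCT-II of a real sequence, computed directly from its definition."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(signal))
        out.append(scale * acc)
    return out

def degrees_of_freedom(coeffs, eps):
    """Smallest number of coefficients whose omitted energy is at most eps of the total."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    kept = 0.0
    for count, energy in enumerate(energies, start=1):
        kept += energy
        if total - kept <= eps * total:
            return count
    return len(energies)

def data_rate(coeffs, eps, bits_per_coeff):
    """R = N_b * DOF, with quantization treated separately from approximation."""
    return bits_per_coeff * degrees_of_freedom(coeffs, eps)

# Assumed test signal: a constant plus one DCT basis function, so only two
# coefficients carry energy.
signal = [2.0 + math.cos(math.pi * (i + 0.5) * 3 / 64) for i in range(64)]
coeffs = dct_orthonormal(signal)
dof_tight = degrees_of_freedom(coeffs, 1e-6)
dof_loose = degrees_of_freedom(coeffs, 0.2)
rate_tight = data_rate(coeffs, 1e-6, bits_per_coeff=8)
```

For this signal a tight error tolerance needs both active coefficients, while a tolerance of 20 percent is met by the constant term alone, so the rate falls as the allowed error grows.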


Implicit in this is the assumption that the overall coding procedure can be separated into two parts:
first, obtaining an adequate transform approximation of the image, and second, the quantization of the
transform coefficients. By approaching the coding problem in this manner it becomes easier to understand one
difficulty with the orthogonal transform coding methods. The large bandwidth reductions reported are due,
in part, to the compacting-of-image-energy property of orthogonal transformations. The difficulty in
quantization of the coefficients is the result of the fact that any compacting in the transform domain is at the
expense of an increased dynamic range in the transform coefficients, because of the conservation of energy
inherent to all orthogonal transformations.

This would seem to indicate that a suitable set of transform functions would possess both good
approximating properties, and transform coefficients whose dynamic range is of the order of that of the original
image pixels. This idea could be extended to finding the best set {φ} minimizing R(φ, ε), i.e.,

    \min_{\phi} R(\phi, \epsilon) = \min_{\phi} [N_b \cdot DOF(\epsilon, \phi)] = R(\epsilon) = N_b \cdot DOF(\epsilon)


Again the assumption is made that the quantization and approximation steps can be separated.

Thus, one method of finding the epsilon-degrees-of-freedom, or the minimum data rate, would be to find the
set of functions which, when used to approximate the image at an error rate epsilon, would require the
fewest number of functions. This is a very difficult problem, so the results in this work will be implemented
using kth order splines. Splines are chosen due to their excellent image approximation properties [2], their
desirable computation characteristics and the feasibility of an optical implementation.


Methods (Least Squares Spline Methods)


While the determination at each step of such a best approximating spline is simply a nonlinear minimization
problem over the knots* defining the spline, it is computationally infeasible. Thus we must follow
DeBoor [3] and settle for spline approximations with good, if not optimal, knot placements. In what follows,
an easily implemented knot placement method will be given that can result in a significant error reduction
over the uniform knot case. The results will be developed using splines, giving the following kth order
spline approximation of f(x,y):

    f(x,y) \approx \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} S_{ij} N_{i,k}(\xi; x) N_{j,k}(\eta; y) = S_{k,N_x,N_y}(x,y)


where N_{i,k}(\xi; x) are the normalized B-splines of order k (degree k-1) described by DeBoor [4], and ξ and η
are the knot vectors in the x and y direction respectively. The spline coefficients, S_{ij}, are obtained by
solving the following system of equations:

    f(x_l, y_m) = \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} S_{ij} N_{i,k}(\xi; x_l) N_{j,k}(\eta; y_m)    (2)

for l = 1, 2, ..., N and m = 1, 2, ..., N. In matrix notation this becomes

    [f(x_l, y_m)] = [N_{i,k}(\xi; x_l)]^T [S_{ij}] [N_{j,k}(\eta; y_m)]

where [ ]^T indicates matrix transpose. To simplify notation let

    [f(x_l, y_m)] = [F]
    [N_{i,k}(\xi; x_l)] = [N]_{(\xi)}
    [N_{j,k}(\eta; y_m)] = [N]_{(\eta)}.

Equation (2) becomes

    [F] = [N]_{(\xi)}^T [S_{ij}] [N]_{(\eta)}

Since N > N_x and N > N_y, in general, equation (2) cannot be solved exactly. However, the spline coefficients
that minimize the normalized least-squares error e, given by the expression:

    e = \frac{\sum_{l=1}^{N} \sum_{m=1}^{N} [f(x_l, y_m) - S_{k,N_x,N_y}(x_l, y_m)]^2}{\sum_{l=1}^{N} \sum_{m=1}^{N} |f(x_l, y_m)|^2}

can be obtained by taking [S_{ij}] to be

    [S_{ij}] = [[N]_{(\xi)} [N]_{(\xi)}^T]^{-1} [N]_{(\xi)} [F] [N]_{(\eta)}^T [[N]_{(\eta)} [N]_{(\eta)}^T]^{-1}
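The separable least-squares solution above can be sketched with small, explicit matrices. The example below is a minimal illustration, assuming second order (piecewise-linear) B-splines on uniform knots over [0, 1], a 17 x 17 sample grid, and a bilinear test surface that lies exactly in the spline space; none of these choices comes from the paper, and the helper names are hypothetical.

```python
def hat_basis(num_knots, num_samples):
    """Rows: order-2 (linear) B-splines on uniform knots over [0, 1]; columns: samples."""
    h = 1.0 / (num_knots - 1)
    xs = [l / (num_samples - 1) for l in range(num_samples)]
    return [[max(0.0, 1.0 - abs(x - i * h) / h) for x in xs] for i in range(num_knots)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def solve(a, b):
    """Solve a @ x = b (matrix right-hand side) by Gauss-Jordan with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_spline_coefficients(F, Nx, Ny):
    """S = (Nx Nx^T)^-1 Nx F Ny^T (Ny Ny^T)^-1, the least-squares coefficients."""
    left = solve(matmul(Nx, transpose(Nx)), matmul(Nx, F))
    right = matmul(left, transpose(Ny))
    gram_y = matmul(Ny, transpose(Ny))
    # Right-multiplication by the inverse of a symmetric matrix: X G^-1 = (G^-1 X^T)^T
    return transpose(solve(gram_y, transpose(right)))

def surface(x, y):
    """Assumed test surface; bilinear, so linear splines represent it exactly."""
    return 1.0 + 2.0 * x + 3.0 * y + 4.0 * x * y

Nx, Ny = hat_basis(5, 17), hat_basis(4, 17)
F = [[surface(l / 16, m / 16) for m in range(17)] for l in range(17)]
S = fit_spline_coefficients(F, Nx, Ny)
```

Because the test surface lies in the span of the basis, the fitted coefficients equal the surface values at the knots; for a general image the same formula returns the coefficients that minimize the squared error over the sample grid.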


The remainder of this subsection is concerned with the possibility of subsectioning the image and using
different knot densities in each of the subsections, and with the quantization of the spline coefficients.
It is reasonable that subsectioning might provide fruitful results, when one considers an L_2 error bound
given by Schultz [5] for kth order splines. Recalling that the error is given by the L_2 norm, ||·||_2, of
the difference between the function and its approximation, this bound is given by
    ... + \| \frac{\partial^k}{\partial y^k} f(x,y) \|_2    (4)

where ρ = max { max_i (ξ_{i+1} - ξ_i), max_j (η_{j+1} - η_j) }

      C = O(4)

      k = even integer

Thus if the image derivative energy is large only over a small region, then using a uniform knot kth order
spline with knot width equalling ρ as indicated by equation (4) should result in an overly good approximation
of the image in those regions where the image derivative energy is low. Thus we should be able to obtain
reasonable results by employing a different kth order spline with uniformly spaced knots in each subsection
if the knot density in each subsection is proportional to the value of

    \| \frac{\partial^k}{\partial x^k} f(x,y) \|_2 + \| \frac{\partial^{k/2}}{\partial x^{k/2}} \frac{\partial^{k/2}}{\partial y^{k/2}} f(x,y) \|_2 + \| \frac{\partial^k}{\partial y^k} f(x,y) \|_2

in that subsection.

After placement of knots and solution for the least-squares coefficients, the spline coefficients are
uniformly quantized on a subsection by subsection basis. A uniform quantizer was chosen due to a lack of a
better understanding of the coefficient statistics at this time. The number of coefficient quantization
levels in each subsection was proportional to the variance of the spline coefficients in that subsection,
with the maximum number of levels chosen for the subsection with the highest pixel variance. The
proportionality constant which determines the number of quantization levels in each subsection is chosen to achieve
the overall desired bit/pixel rate.
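The adaptive uniform quantization described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: levels are spread evenly between the minimum and maximum coefficient of a subsection, the level count scales with variance relative to the highest-variance subsection, and a floor of two levels is an added assumption. The function names are hypothetical.

```python
def allocate_levels(variances, max_levels):
    """Levels per subsection, proportional to variance; the largest variance gets max_levels.
    The floor of 2 levels is an assumption made for this sketch."""
    top = max(variances)
    return [max(2, round(max_levels * v / top)) for v in variances]

def uniform_quantize(values, levels):
    """Map each value to the nearest of `levels` evenly spaced reconstruction levels
    spanning the minimum and maximum of the subsection."""
    lo, hi = min(values), max(values)
    if hi == lo or levels < 2:
        return [lo for _ in values]
    step = (hi - lo) / (levels - 1)
    return [lo + round((v - lo) / step) * step for v in values]

levels = allocate_levels([1.0, 4.0, 16.0], max_levels=32)
quantized = uniform_quantize([0.0, 0.26, 0.5, 1.0], levels=5)
```

With this scheme the reconstruction error of any coefficient is at most half a quantization step, and only the minimum and maximum reconstruction levels and the level count need to be sent as side information for each subsection.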

Since this is an adaptive quantization algorithm some overhead is necessary. The total number of bits
required for transmission, N_T, is related to the total number of overhead bits, N_o, and the
number of coefficient bits, N_R, as follows:

    N_T = N_R + N_o

N_o can be determined by consideration of the fact that the overhead consists of the bits required to describe
the subsection quantizers, N_q, the maximum possible number of coefficients per subsection, N_c, and the
maximum possible number of bits per coefficient, N_b, in each subsection. Thus if N_s is the number of subsections
the number of overhead bits is given by

    N_o = N_s (N_q + N_c + N_b).

The number of bits required to describe the coefficients, N_R, is simply the sum over all the subsections
of the number of bits required to describe the subsection spline coefficients.

Description of the subsection quantizers requires the maximum and minimum reconstruction levels. These
are quantized to 32 bits each to ensure sufficient accuracy so that N_q = 2 x 32 = 64. N_s is taken arbitrarily
to be 64 so that a subimage will have a maximum of 32 knots in the x and y directions if N = 256. Since the
maximum possible number of coefficients is (32)^2, 5 bits are sufficient to describe the number of knots,
or equivalently, the number of coefficients. The maximum number of quantization levels is taken to be 32 so that
N_b = 5. These values are summarized in Table 1.
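The overhead implied by these values can be worked out directly. The short calculation below is a sketch under one stated assumption: that the coefficient-count field N_c contributes the 5 bits needed to describe the number of knots, which is how the text above reads but is not confirmed by Table 1 here. Variable names mirror the symbols in the text.

```python
def overhead_bits(num_subsections, quantizer_bits, count_bits, depth_bits):
    """N_o = N_s (N_q + N_c + N_b): side information summed over all subsections."""
    return num_subsections * (quantizer_bits + count_bits + depth_bits)

N = 256        # image is N x N pixels
N_s = 64       # number of subsections
N_q = 2 * 32   # max and min reconstruction levels at 32 bits each
N_c = 5        # assumed: bits describing the number of knots (up to 32)
N_b = 5        # bits describing the number of quantization levels (up to 32)

N_o = overhead_bits(N_s, N_q, N_c, N_b)
overhead_per_pixel = N_o / (N * N)
```

Under this assumption the overhead is a few thousand bits for the whole image, or well under a tenth of a bit per pixel, which is small next to a coefficient budget of about 1 bit/pixel.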

Experimental  Results  (Least  Squares  Spline) 

To demonstrate the utility of using splines for image coding, an experiment was performed on the 256x256
pixel image shown in Figure 1. The image was partitioned into 64 subsections and approximated by second order
splines. The unquantized spline approximation to the image is shown in Figure 2. The spline coefficients
were then quantized at a rate of approximately 1 bit/pixel, including the overhead, and then used to produce
the quantized image in Figure 3. The corresponding errors are shown in Table 2. Note that the quantization
step at this bit rate has not introduced an excessive error increase over the unquantized spline
approximation. At 1.01 bit/pixel an error less than .5% is quite reasonable considering the non-optimal use
of a uniform quantizer. A Max quantizer [6] employing the proper statistical properties would most likely
produce better results. Nevertheless, the visual qualities of the quantized reconstruction in Figure 3 are
quite good and demonstrate that splines are a feasible approach to the image coding problem.

Optical Implementation (Derivative Spline)

The possibility of optically implementing the spline coding algorithms of the previous section is based
on the facts that: a least-squares kth order spline approximation to an image produces a least-squares
approximation to its derivatives up to order k-1, in terms of lower order splines and the divided differences of
the spline coefficients; and that the k-1 derivative of a kth order spline is a first order spline of the
form

    S_{1,N_x,N_y}(x,y) = \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} C_{ij} N_{i,1}(\xi; x) N_{j,1}(\eta; y)

where

    N_{i,1}(\xi; x) = 1 if x \in (\xi_i, \xi_{i+1}), 0 otherwise

    N_{j,1}(\eta; y) = 1 if y \in (\eta_j, \eta_{j+1}), 0 otherwise

An understanding of the process involved in obtaining a least-squares 1st order spline approximation of
the k-1 derivative of f(x,y), D_x^{k-1} D_y^{k-1} f(x,y), can be gained by consideration of figure (4). This figure
shows the domain of definition of a particular subsection of D_x^{k-1} D_y^{k-1} f(x,y), along with the knots
defining S_{1,N_x,N_y}(x,y). Since the least-squares approximation of D_x^{k-1} D_y^{k-1} f(x,y) by a constant in the
rectangle A_{ij} = (\xi_i, \xi_{i+1}) x (\eta_j, \eta_{j+1}) is obtained by setting the constant equal to the average
value of D_x^{k-1} D_y^{k-1} f(x,y) in that rectangle, C_{ij} is given by

    C_{ij} = \frac{1}{(\xi_{i+1} - \xi_i)(\eta_{j+1} - \eta_j)} \int_{\xi_i}^{\xi_{i+1}} \int_{\eta_j}^{\eta_{j+1}} D_x^{k-1} D_y^{k-1} f(x,y) dy dx

with the estimate being

    D_x^{k-1} D_y^{k-1} f(x,y) \approx \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} \frac{1}{(\eta_{j+1} - \eta_j)(\xi_{i+1} - \xi_i)}
        [ \int_{\xi_i}^{\xi_{i+1}} \int_{\eta_j}^{\eta_{j+1}} D_\alpha^{k-1} D_\beta^{k-1} f(\alpha, \beta) d\beta d\alpha ] N_{i,1}(\xi; x) N_{j,1}(\eta; y)    (7)

f(x,y) is then obtained by a k-1 fold integration of equation (7) in the x and y directions with inclusion
of the proper initial conditions. Since initial conditions in the higher order derivatives produce simple
monomial changes in density across the face of the image (a situation which is unlikely to occur in practice)
these are assumed to be zero. Thus only the coefficient matrix C and the image initial conditions need be
quantized and transmitted to the user. The image is reconstructed by using an idealized coherent processor
of figure (5). Here the approximation to D_x^{k-1} D_y^{k-1} f(x,y) is the input. A filter whose transfer function
corresponds to that of a k-1 order integrator in the x and y directions is placed in the back focal plane
of Lens L1. Thus the output is the kth order spline approximation to f(x,y) minus the initial
conditions. It should be noted that an actual implementation would involve the use of "leaky" integrators since
the transfer function of an ideal integrator is unrealizable. However it is felt that this would not
seriously affect the performance of the system. The initial condition estimates f(x,-1) and f(-1,y) are
introduced with a beam splitter so that the final reconstructed image S_{k,N_x,N_y}(x,y) appears in the output plane
of Lens L2.

* See the Appendix.


Heretofore, the analysis has been idealized and simplistic in the sense that negative values of
S_{1,N_x,N_y}(x,y) exist and present problems to imaging devices that are intensity sensitive. This does not
present an insurmountable difficulty in the reconstruction process since two processors can be implemented;
one for the positive portions of S_{1,N_x,N_y}(x,y) and one for the corresponding negative portions. Since f(x,y)
is always greater than zero the initial conditions need be included only in the processor for the
positive portion. Labeling the appropriate portions

    S^+_{1,N_x,N_y}(x,y), S^-_{1,N_x,N_y}(x,y),

    S^+_{k,N_x,N_y}(x,y), and S^-_{k,N_x,N_y}(x,y):

each can be obtained with a dual processor that includes a subtraction step as shown schematically in
figure (6). The subtraction step can be implemented either optically or electronically before a final
image display step.

The difficulties with negative values of S_{1,N_x,N_y}(x,y) are not so much with the reconstruction step,
since a coherent processor can handle negative as well as positive values, but with the determination of
S_{1,N_x,N_y}(x,y) itself. D_x^{k-1} D_y^{k-1} f(x,y) can be obtained optically but detecting its negative values with an
intensity sensitive detector requires holographic recording techniques. This represents an unnecessary
duplication if this can be avoided by some other optical or hybrid processing technique. One such technique
would involve imaging f(x,y) with an NxN CCD camera whose output was the kth order divided difference
of the pixel matrix F, \nabla^{k-1} F [\nabla^{k-1}]^T. Here

    \nabla = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix}

and F = [f(x_i, y_j)].

$V^{k-1} F [V^{k-1}]^T$ is then averaged down to produce $C$ by a microprocessor or a hard-wired algorithm.
This is shown diagrammatically in figure (7). The quantizer is shown incorporated in the averaging
processor so that its output is the quantized version of $C$, $C_q$. The averaging rate is determined by
the derivative energy processor. The rate information can be obtained either arithmetically from
$V^{k-1} F [V^{k-1}]^T$ or optically from $f(x,y)$, hence the two inputs shown. The output is the dimension
of the $C$ matrix or an equivalent quantity. $C_q$ is shown as the output of the quantizer, where it is
then split into two parts $C_q^+$ and $C_q^-$, where $C_q^+$ and $C_q^-$ are given by

$$C_q^+ = [\max(0, c^q_{ij})],$$

$$C_q^- = [\,|\min(0, c^q_{ij})|\,].$$

Thus $C_q^+$ and $C_q^-$ are matrices consisting of non-negative elements such that

$$C_q = C_q^+ - C_q^-.$$

Since $C_q^+$ and $C_q^-$ are sufficient for the user to generate the reconstruction, $C_q^+$ and $C_q^-$
are transmitted to the user. Also transmitted are the quantized versions of the initial conditions
$f_q(\ldots)$.
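
The divided difference and the split into non-negative parts can be illustrated with a small numerical
sketch. It is only an illustration under stated assumptions: it uses a single first-order difference
($V F V^T$ rather than $V^{k-1} F [V^{k-1}]^T$), a square pixel matrix, and omits the averaging and
quantizing steps; the function names are hypothetical and do not come from the report.

```python
def first_difference_matrix(n):
    """Return the (n-1) x n matrix V whose rows are (..., -1, 1, ...)."""
    return [[(-1 if j == i else 1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n - 1)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def divided_difference(f):
    """Compute V F V^T for a square pixel matrix F (first-order case only)."""
    v = first_difference_matrix(len(f))
    return matmul(matmul(v, f), transpose(v))


def split_signed(c):
    """Split C into non-negative matrices (C_plus, C_minus) with C = C_plus - C_minus."""
    c_plus = [[max(0, x) for x in row] for row in c]
    c_minus = [[abs(min(0, x)) for x in row] for row in c]
    return c_plus, c_minus


# Illustrative 3 x 3 pixel matrix (made-up values).
F = [[1, 2, 4],
     [3, 3, 3],
     [0, 5, 1]]
C = divided_difference(F)
C_plus, C_minus = split_signed(C)
```

Both `C_plus` and `C_minus` contain only non-negative entries, so each could in principle be handled by an
intensity-sensitive device, and their element-wise difference restores `C`.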

Experimental Results (Derivative Spline)

To demonstrate the feasibility of the hybrid spline encoding system a simulation was performed, and these
results were compared with those of the least squares spline at a data rate of approximately 2 bits/pixel.
The optically generated spline results are shown in figures 8 and 9. Figure 8 shows the unquantized results.
A little blocking is evident, but otherwise the reconstruction possesses good detail, and is a generally
faithful reproduction of the original. Figure 9 shows the results when the derivative spline coefficients
are quantized at a rate of 2.04 bits/pixel including overhead. Here the detail in the mother and child has
remained quite good with most of the features easily recognized. The background has been degraded and the
blocking is more evident. The increased blocking is due to the suboptimal quantization of the initial
conditions, to which the overall performance is quite sensitive. Nevertheless, the performance of this
initial system compares favorably with the least squares spline results at 2 bits/pixel as shown in
Table 3 and figure 10. As can be seen, the error in the optical spline is elevated for both the unquantized
and quantized versions. These come about due to the nature of the simulation. The results concerning the
derivative properties of spline approximations as outlined in the appendix are true for the continuous
model and not necessarily for the discrete case used in the simulation. Thus the actual optically
implemented system might display a slightly improved performance. It is also expected that the use of the
proper statistics in generating a Max quantizer would greatly improve the performance.

Conclusions


It would seem that spline functions are quite attractive for image coding purposes from both a performance
and an implementation viewpoint. Concerning performance, an error of less than .5% at a rate of
approximately 1 bit/pixel is certainly competitive with the orthogonal transform techniques. An optical
implementation has been proposed and simulated that is both feasible to implement and would provide a
real time implementation. This optical implementation is a hybrid process since it combines a coherent
optical processor for image construction with a combination digital-noncoherent processor in the encoder.

It should be noted that the errors and corresponding data rate were achieved with a non-optimal quantizer
and further work is necessary in this area. Further study and simulations are necessary and in progress to
achieve a better understanding of the optical implementation. It is felt that these efforts should
provide fruitful results.


Appendix 

In this appendix, the properties of the derivatives of a $k$th order spline approximation to an image
$f(x,y)$ are investigated. It will be shown that such an approximation also provides a least-squares
approximation to the derivatives of $f(x,y)$ by taking the proper divided differences of the spline
coefficients and expanding in terms of the proper lower order splines. In other words, if

$$S^{k}_{N_x,N_y}(x,y) = N_k^T(\xi,x)\, S\, N_k(\eta,y)
= \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} s_{ij}\, N_{i,k}(\xi,x)\, N_{j,k}(\eta,y) \qquad \text{(A-1)}$$

is a least-squares approximation to $f(x,y)$ on $[-1 \le x,y \le 1]$, then

$$N_{k-i}^T(\xi,x)\, V^i S\, [V^j]^T N_{k-j}(\eta,y) \qquad \text{(A-2)}$$

is a least-squares approximation to $D_x^i D_y^j f(x,y)$ on $[-1 \le x,y \le 1]$. Here

$$D_x^i = \frac{\partial^i}{\partial x^i}, \qquad D_y^j = \frac{\partial^j}{\partial y^j},$$

and $V$ is an $(N-1) \times N$ matrix, if $N_k(\xi,x)$ and $N_k(\eta,y)$ are $N \times N$ matrices. $V$ is given by:

$$V = \frac{1}{h} \begin{bmatrix}
-1 & 1 & 0 & \cdots & 0 \\
0 & -1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & -1 & 1
\end{bmatrix}$$

where $h$ is the knot mesh width of the spline.

The analysis will be performed in one dimension, as the two dimensional equivalent result is obtained
immediately using a direct product of splines.


The proposition is: If $S^k_N(x)$ is a least-squares approximation to $f(x)$ on the interval
$[-1 \le x \le 1]$ given by

$$S^k_N(x) = \sum_{i} s_i N_{i,k}(\xi,x) = N_k^T(\xi,x)\, S \approx f(x) \qquad \text{(A-3)}$$

then $N_{k-j}^T(\xi,x)\, V^j S$ is a least-squares $(k-j)$th order spline approximation to $D^j f(x)$ on
$[-1 \le x \le 1]$.


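Proof:

If $S^k_N(x)$ is a least-squares approximation to $f(x)$ on $[-1 \le x \le 1]$, then the vector $S$ must
satisfy

$$\Gamma_k S = \alpha \qquad \text{(A-4)}$$

where

$$\Gamma_k = \int_{-1}^{1} N_k(\xi,x)\, N_k^T(\xi,x)\, dx, \qquad
\alpha = \int_{-1}^{1} f(x)\, N_k(\xi,x)\, dx,$$

so that

$$S = \Gamma_k^{-1} \int_{-1}^{1} f(x)\, N_k(\xi,x)\, dx. \qquad \text{(A-5)}$$

Similarly, if $N_{k-1}^T(\xi,x)\, V S$ is a least-squares approximation to $D_x f(x)$, then

$$\Gamma_{k-1} V S = \int_{-1}^{1} N_{k-1}(\xi,x)\, D_x f(x)\, dx \qquad \text{(A-6)}$$

so that

$$V S = \Gamma_{k-1}^{-1} \int_{-1}^{1} N_{k-1}(\xi,x)\, D_x f(x)\, dx. \qquad \text{(A-7)}$$

It must be shown that $S$ given by (A-5) implies (A-6). Therefore, substituting the right side of (A-5) into
the left side of (A-6) and writing $\Gamma_{k-1}$ explicitly, one obtains

$$\Gamma_{k-1} V \Gamma_k^{-1} \int_{-1}^{1} N_k(\xi,z) f(z)\, dz
= \int_{-1}^{1} N_{k-1}(\xi,x)\, N_{k-1}^T(\xi,x)\, V\, \Gamma_k^{-1} \int_{-1}^{1} N_k(\xi,z) f(z)\, dz\, dx$$

which is [4]

$$\int_{-1}^{1} N_{k-1}(\xi,x)\, D_x N_k^T(\xi,x)\, \Gamma_k^{-1} \int_{-1}^{1} N_k(\xi,z) f(z)\, dz\, dx.$$

But a least-squares estimate has the property $f(z) \approx N_k^T(\xi,z)\, a$, for some $a$ [7,8],
so that

$$N_k^T(\xi,x)\, \Gamma_k^{-1} \int_{-1}^{1} N_k(\xi,z) f(z)\, dz = N_k^T(\xi,x)\, a \approx f(x)$$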
Figure (8). Unquantized Optical Spline Approximation

Figure (9). Quantized Optical Spline Approximation

Figure (10). Quantized Least-Squares Spline Approximation.


LEAST SQUARES SPLINE
Unquantized
1.89 bits/pixel

DERIVATIVE SPLINE
Unquantized
2.04 bits/pixel

MSE = .2304    MSE = .2344

MSE = .4684    MSE = .6394

Table 3.
Introduction


The increasing complexity and variety of image sensors has been the source of interest
in the development of data compression for images. Image data compression has become one of the
most active topics of research in digital image processing as a result [1]. The continued
evolution of digital circuitry has caused the focus of data compression research to lie in
digital implementations. However, there is also a potential for optical computations in
image data compression, as was demonstrated in the concepts of interpolated DPCM [2]. The
method of DPCM data compression is one of the most thoroughly studied techniques. DPCM
achieves data compression by separating the image information into two parts: the
low-spatial frequencies and the high-spatial frequencies. Low-spatial frequencies are
retained by exploiting their predictability; high-spatial frequencies are retained at fewer
significant bits, and substantial data compression is achieved. Interpolated DPCM is a
mechanism for separating an image into low- and high-spatial frequency components, with a
similar amount of data compression being achieved. The computations to achieve the
separation can be implemented by simple incoherent optical devices [2].

The greatest amount of data compression can be achieved only by adaptive processing,
i.e., processing that changes as local image characteristics change [1]. Thus, the most
efficient DPCM compression schemes are adaptive. The adaptive approach is not easily
included in the optical processes examined for interpolated DPCM, however, since adaptation
would require optical responses that vary within the image plane. To achieve adaptive data
compression with optical components requires extremely sophisticated methods of adaptive
optical spline interpolation [3]. The advantages of adaptive computation are thus offset
by the extreme complexity required to implement them.

In the following we describe a system which has the capability of being partially
adaptive, but which is much less complex in system architecture and implementation. It can be
considered as feasible for optical implementation with much less system complexity than the
optical spline interpolation systems previously examined [3].

An  Edge-Detecting  Compression  Technique 

It has been known for some time that image data compression systems can be successfully
developed around the separation of the original image into low- and high-frequency
components. A fundamental consideration in such a system, therefore, is to answer the question:
by what criterion do we define the separation between low- and high-spatial frequencies?

The new data compression technique we describe below answers this question in terms of a
criterion that we believe is relevant to the final quality of an image, that is, a criterion
concerning visible or perceived aspects of high-frequency information. We believe that the
high-frequency information in an image is principally associated with the edges of objects
in the original scene. Therefore, if we can successfully represent the edges of objects,
we have the prospect of data compression that is acceptable in visual utility to the human
being that constitutes the final end-user of the compressed data.

A schematic diagram of the proposed data compression system is seen in Figure 1. There
are three separate parallel paths; structures to the left of the dotted line make up the
compression operations, and structures to the right of the line are the reconstruction
operations. The topmost of the parallel paths is responsible for deriving the low-spatial
frequency information of the image which is to be compressed. This is done in a manner
identical to that developed for the interpolated DPCM technique previously discussed [2]. A set
of subsamples is extracted from the original image. For example, if the original image
possessed an intrinsic resolution equivalent to 512 x 512 pixels, the subsampled image
might be 128 x 128, and would be created by extracting every 4th pixel of every 4th line.
The subsamples are then quantized at a fixed number of bits and transmitted. At the
receiver the subsamples are used to create a low-frequency version of the original image.
The subsamples are inserted into a matrix whose size is equal to the resolution in pixels
of the original image, and zeroes are inserted in place of the pixels which were discarded
in the subsampling process; e.g., for the case above the 128 x 128 subsamples would be
inserted at every 4th pixel of every 4th line of a 512 x 512 matrix of zeroes. This matrix of
subsamples is then interpolated to replace the zeroes by interpolated subsample values.
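
The low-frequency path described above can be sketched digitally as follows. This is a minimal
illustration, not the optical implementation: the function names are hypothetical, bilinear
interpolation is assumed as the interpolator, and pixels beyond the last subsample are simply held
constant.

```python
def subsample(image, step):
    """Keep every `step`-th pixel of every `step`-th line."""
    return [row[::step] for row in image[::step]]


def zero_insert(samples, step, height, width):
    """Place the subsamples back on a full-size grid of zeroes."""
    full = [[0] * width for _ in range(height)]
    for i, row in enumerate(samples):
        for j, value in enumerate(row):
            full[i * step][j * step] = value
    return full


def interpolate(samples, step, height, width):
    """Bilinear interpolation between subsamples (assumed interpolator).
    Pixels past the last subsample are held at the nearest subsample value."""
    last_i, last_j = len(samples) - 1, len(samples[0]) - 1
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        i0 = min(y // step, last_i)
        i1 = min(i0 + 1, last_i)
        ty = (y - i0 * step) / step if i1 > i0 else 0.0
        for x in range(width):
            j0 = min(x // step, last_j)
            j1 = min(j0 + 1, last_j)
            tx = (x - j0 * step) / step if j1 > j0 else 0.0
            top = (1 - tx) * samples[i0][j0] + tx * samples[i0][j1]
            bottom = (1 - tx) * samples[i1][j0] + tx * samples[i1][j1]
            out[y][x] = (1 - ty) * top + ty * bottom
    return out


# Illustrative 8 x 8 test image with a linear ramp (made-up values).
image = [[10 * y + x for x in range(8)] for y in range(8)]
samples = subsample(image, 4)
zero_filled = zero_insert(samples, 4, 8, 8)
low_frequency = interpolate(samples, 4, 8, 8)
```

For a 512 x 512 image and `step = 4`, `subsample` returns the 128 x 128 array mentioned in the text.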


As discussed in the paper describing interpolated DPCM, both the subsampling and
interpolation operations can be carried out by simple incoherent optical processes. For example,
subsampling can be implemented with a simple focal-plane scanning mechanism, e.g., pick-off
mirrors and apertures, and interpolation can be implemented via an out-of-focus apodized
lens imaging a matrix of displayed subsamples.


The bottom two parallel paths in Figure 1 are the portions of the data compression system
which represent the high-spatial frequency information in the original image. The
bottommost path consists of the mechanisms to detect edges. The rationale for the level slicing
and edge detection is seen in Figure 2. A coarse slicing of levels causes a steep gradient
to undergo a change in a number of levels in a short distance of space. An edge detector
which can detect the existence of major level changes in a small spatial region has
identified image areas which must be accurately represented to retain the sensitivity to edges
of the human visual system. It is in this sense that the system of Figure 1 may be
considered adaptive, i.e., it concentrates the high-frequency coding in regions of edges, edges
being the most severe problems for high-frequency representation.
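
A digital sketch of level slicing followed by edge detection is given below. The edge operator used
here (largest slice-level jump to the right-hand or lower neighbour) is a hypothetical stand-in chosen
for brevity; it is not claimed to be the operator used in the simulation, and the function names are
illustrative only.

```python
def level_slice(image, levels, lo=0, hi=255):
    """Requantize pixel values into `levels` uniform slices numbered 0..levels-1."""
    width = (hi - lo + 1) / levels
    return [[min(int((p - lo) / width), levels - 1) for p in row] for row in image]


def edge_map(sliced, threshold):
    """Mark a pixel as an edge when the slice level changes by `threshold`
    or more steps relative to its right-hand or lower neighbour."""
    h, w = len(sliced), len(sliced[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(sliced[y][x] - sliced[y][x + 1]) if x + 1 < w else 0
            down = abs(sliced[y][x] - sliced[y + 1][x]) if y + 1 < h else 0
            edges[y][x] = max(right, down) >= threshold
    return edges


# A steep step crosses many slice levels at once; a gentle ramp never does.
step_edges = edge_map(level_slice([[10, 12, 200, 205]], 16), 5)
ramp_edges = edge_map(level_slice([[0, 16, 32, 48]], 16), 5)
```

With 16 slices and a 5-step threshold, only the abrupt jump from 12 to 200 is flagged, which is the
adaptive behaviour the level slice is meant to produce.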


Figure 2: Adaptive Effects of Level Slice


The information concerning edge locations is used to "control" the switch in the middle
path of Figure 1. The switch is not literal but symbolic of the following selection
process. In regions where the low-frequency image dominates (that is, regions which are devoid
of edges), it is assumed that the low-frequency representation created by the uppermost
path is an adequate quality image. However, in regions where the edges are important (that
is, regions where edges are detected) the actual values of image pixels are selected from
the image region and transmitted. These literal edge pixel values are used to replace the
low-frequency representation of the edge pixels that would be constructed by the
low-frequency interpolator. The system has the overall behavior of trying to represent the
image by low frequencies, but with adaptive mechanisms to select and retain the most
important high-frequency content for the edge structures.
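
The selection process can be sketched as a per-pixel choice between the interpolated value and the
requantized original value. The uniform quantizer with mid-point reconstruction is an assumption made
for this illustration, and the function names are hypothetical.

```python
def quantize(value, bits, max_value=255):
    """Uniformly requantize to `bits` bits and return the mid-point of the bin
    (mid-point reconstruction is an assumption of this sketch)."""
    step = (max_value + 1) / (1 << bits)
    return (int(value / step) + 0.5) * step


def reconstruct(low_freq, original, edges, edge_bits=4):
    """Use the low-frequency image everywhere except at edge pixels,
    where the requantized original pixel value is inserted instead."""
    return [[quantize(o, edge_bits) if e else l
             for l, o, e in zip(lrow, orow, erow)]
            for lrow, orow, erow in zip(low_freq, original, edges)]


# One scan line with a single detected edge pixel (made-up values).
low_freq = [[50, 60, 70]]
original = [[52, 200, 71]]
edges = [[False, True, False]]
result = reconstruct(low_freq, original, edges)
```

Only the pixel flagged as an edge takes its value from the original image; the other two keep the
low-frequency estimate.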

The implementation of the bottommost path in Figure 1 can also be achieved by optical
processing, or in combination with discrete focal plane sensors. For example, the level
slicing operation has been demonstrated by the nonlinear halftone work of Sawchuk [4]. The
edge detection operation can be broken into two steps: edge calculations and edge thresholds.
The edge calculation is, in essence, the convolution of the level-sliced image with a
particular point-spread-function, and a convolution can be implemented optically [5]. The
edge threshold is a trivial case of level-slicing and can be implemented, again, by the
Sawchuk method. Alternatively, the threshold can be calculated digitally. In this case,
the image output from the level slice would be sampled with a discrete sensor array, e.g.,
an array of silicon photo-diodes or imaging CCD elements. A simple digital processor would
access the sensor array elements, perform analog-to-digital conversions and digitally
implement the threshold operations.

The optical implementation of the level-slice and edge calculation would require a step
in converting the incoming image from non-coherent to coherent illumination, since the
proposed optical processors require coherent sources. There are a number of ways of achieving
this conversion, however, and we will not dwell upon the specifics.

The major computations for data compression in Figure 1 can be implemented optically.
The actual transmission of compressed image values would most likely be in digital format,
given the increasing preference for digital data transmission techniques. Thus, following
the computations for data compression the compressed data must be coded for transmission,
and digital hardware will be employed. The final system is, thus, most appropriately a
hybrid optical/digital system. In the following section we indicate some of the digital
coding considerations.


Simulation  of  the  Adaptive  System 

To assess the efficacy of an edge-detecting optical compression system, a simulation of
the architecture in Figure 1 was carried out with digital image processing techniques; that
is, optical convolutions for interpolation were implemented by digital convolutions, optical
level-slicing was implemented with a digital requantization, etc.

In the simulation of the edge-detection compression on an image, the following parameters
are important:

(1) Subsampling increment, the spacing between pixels retained for reconstruction
of a low-frequency image;

(2) Number of slice levels, the number of uniform steps into which the pixel
radiance values are divided;

(3) Edge threshold measure, the number of steps in level which are chosen to
represent an edge;

(4) Subsampling quantization, the number of bits to quantize a sub-sample for
transmission;

(5) Edge quantization, the number of bits to quantize a pixel chosen as an edge
for transmission;

(6) Edge coding, the method chosen to represent the location of detected edges.

Ideally, the optimum values for those parameters would be chosen on the basis of a
comprehensive theory of the system. No such theory has been developed, however, and we have
initially chosen values for the parameters from experimentation with the method.

Figure 3 shows an original digital image, sampled at 256 x 256 pixels, with each pixel
quantized at 8 bits. For the simulation of the compression system, the following
parameters were chosen:


(1) The subsampling increment was every 4th line and every 4th pixel. The image
interpolated at the receiver from this 4x4 subsampling is shown in Figure 4.
The loss of resolution and the inability to identify objects in this image is
pronounced. The sub-sampling quantization was 4 bits, i.e., the interpolation
shown in Figure 4 was created from 4-bit requantization of the subsamples
selected from Figure 3.

(2) For level-slicing, 16 slices were chosen, the 16 levels being uniformly
distributed between the minimum and maximum pixel grey values (0 and 255).

(3) An edge was assumed to be 5 or more steps in the slices produced by the
level slicing. Figure 5 shows the map of edges produced by using an edge
detector with this 5-step threshold criterion.

(4) Each pixel selected as an edge was quantized at 4 bits. The result of
inserting these 4-bit edge pixels into Figure 4, at the edge locations
defined by Figure 5, is shown in Figure 6, the reconstructed image.


Figure 3: Original Image (256 x 256)

Figure 4: Reinterpolation from 4x4 Subsampling of Figure 3.

Figure 5: Edge Map for Figure 3

Figure 6: Reconstruction: Bit Rate = 1.385 bits/pixel

Two processes critical to the system operation are the edge detector and the coding of
edge locations. A number of edge detectors have been experimentally evaluated, including
some of the most commonly found in current literature, e.g., Laplacian operator, Sobel
operator [6], etc. Experience with these different edge operators, and others devised for
the simulation, has shown that there is little overall difference in compression performance
among a variety of choices for the edge operator. Consequently, ease of implementation is
the rationale for choosing between the different edge operators: the operator which is most
direct or simple to implement is preferred. The coding of edge locations does not offer as
many options, however. It is necessary to represent the location of an edge in as few bits
as possible, in order that the bits saved by the edge-detecting process not be lost in
transmitting edge locations. This also requires a compromise in ease of implementation. A
simple, and yet relatively efficient technique is run-length coding [6,7], coding the length
of a run, where a run is defined to be the number of adjacent pixels which are alike, that
is, either edge or non-edge pixels. Since edges are less frequent in occurrence than
non-edges, a shorter code word is required for edges than non-edges. In Figure 6 a run-length
code of 2 bits for edges and 5 bits for non-edges was chosen.
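
Run-length coding of one scan line of the edge map can be sketched as below. The sketch only counts
code bits with fixed word sizes; how runs longer than a code word can hold were treated is not stated
in the text, so that case is deliberately left unhandled here. Function names are illustrative.

```python
from itertools import groupby


def run_lengths(edge_row):
    """Return (is_edge, length) for each run of alike pixels in one scan line."""
    return [(bool(flag), len(list(group))) for flag, group in groupby(edge_row)]


def code_bits(runs, edge_bits=2, non_edge_bits=5):
    """Total bits if every edge run costs `edge_bits` and every non-edge run
    costs `non_edge_bits` (over-long runs are not handled in this sketch)."""
    return sum(edge_bits if is_edge else non_edge_bits for is_edge, _ in runs)


def decode(runs):
    """Rebuild the edge/non-edge flags of the scan line from its runs."""
    return [is_edge for is_edge, length in runs for _ in range(length)]


# One scan line of an edge map: 1 marks an edge pixel (made-up values).
row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
runs = run_lengths(row)
bits = code_bits(runs)
```

Here ten pixels are described by four runs, two of each kind, for a total of 2 x 2 + 2 x 5 = 14 bits.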

The compression efficiency achieved in the creation of Figure 6 is summarized as follows.
With the 4x4 subsampling at 4 bits per sample, a total of .25 bits per pixel are required.
The run-length code for edge locations in Figure 6 amounted to 5211 edge runs and 5371
non-edge runs, and the total code bits in the run-length code is equal to 0.5688 bits per pixel.
Finally, there are 9276 detected edge values in Figure 6, and quantizing these edge values
at 4 bits each leads to a total of 0.5661 bits per pixel. The total bit requirement to
represent Figure 6 is, thus, 1.385 bits per pixel. Note that in each of the above items,
"bits per pixel" refers to bits per pixel of the original image (not per subsampled or edge
pixel). Thus, Figure 6 represents a bit reduction from 8.0 bits per pixel to 1.385 bits
per pixel.
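
The bit budget can be checked with a few lines of arithmetic. The only assumption is that the
run-length code words are 2 bits for edge runs and 5 bits for non-edge runs, as stated for Figure 6;
the variable names are illustrative.

```python
PIXELS = 256 * 256                      # size of the original image (Figure 3)

subsample_bpp = 4 / (4 * 4)             # 4 bits for one sample in every 4 x 4 block
run_length_bpp = (5211 * 2 + 5371 * 5) / PIXELS   # edge runs and non-edge runs
edge_value_bpp = 9276 * 4 / PIXELS      # 4 bits for each detected edge pixel

total_bpp = subsample_bpp + run_length_bpp + edge_value_bpp
compression_ratio = 8.0 / total_bpp
```

The three terms reproduce the quoted figures to within rounding of the last digit, and their sum
rounds to 1.385 bits per pixel.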

Figure 7 and Figure 8 represent another pair of images, sampled at 512 x 512 pixels and
processed with parameter choices identical to Figure 3. The resulting bit rate in Figure 8
is 0.922 bits per pixel, as compared to 8 bits per pixel in the original of Figure 7.


Figure 7: Original Image (512 x 512)

Figure 8: Reconstruction: Bit Rate = 0.922 bits/pixel

The visual properties of both Figures 6 and 8 are similar, in that low-contrast
non-structured details present in the original image are replaced, through the
subsampling/interpolation process, by blurred shapes. The edges that define and outline objects are
retained, however, and all major objects visible in the original are visible in the
compressed and reconstructed image. There are some visible artifacts at the edges of the
reconstructed images.


Summary 

We have described an image data compression system which uses edge-detecting mechanisms
to adaptively code image features to which the eye is most sensitive. There are some
obvious directions in which future research in this method should be pointed. For example,
the use of a physiological model of edge-detection would assure that edge detection is
sensitive to properties to which the human eye is also sensitive. A general model of
compression performance, as a function of the system parameters discussed above, is also
needed, in order to optimize the overall system performance for minimum bit-rate and
maximum image quality. A general goal for this type of system is to achieve an average
bit-rate of 1.0 bits per pixel with image quality virtually identical to the original image.
Whereas this goal was not achievable with the simpler method of interpolated DPCM, we
believe the method discussed herein may offer the potential to achieve this goal in optical
componentry for image data compression.
Abstract

Image  data  compression  methods  have  been  dominated  by  digital  computations.  In  this  pa¬ 
per  we  discuss  a  data  compression  concept  which  employs  optical  computations  as  part  of  the 
compression  process.  Simple  optical  processes  are  used  to  separate  an  image  into  low 
frequency  and  high  frequency  components.  These  components  are  then  subjected  to  temporal 
compression,  for  multiframe  imagery,  by  using  a  DPCM  frame-buffer  structure.  Simulations 
of  the  process  are  shown,  with  reasonable  performance  being  seen  at  multiple  frame  com¬ 
pression  rates  of  1.75  bits  per  pixel. 

Introduction

Image  data  compression  is  a  very  active  topic  of  research  in  image  processing.  This  is 
not  surprising  when  one  considers  the  extensive  sources  of  imagery  currently  being  used  for 
a  variety  of  purposes,  e.g.,  LANDSAT  imagery,  medical  imaging,  non-destructive  testing,  etc. 
Many  such  imagery  situations  require  either  the  point-to-point  transmission  of  imagery  and/ 
or  the  archival  storage  of  imagery.  Given  the  economic  costs  associated  with  transmission 
or  archival  storage  of  masses  of  image  data,  the  desirability  of  image  data  compression  to 
reduce  these  costs  is  obvious. 

Image data compression schemes have been dominated by digital processes. That is, the
required computations for an image data compression scheme have been implemented by digital
processes. This is not surprising, given the emphasis on compression schemes which require
the inherent flexibility of a digital scheme, e.g., adaptive cosine transform compression.
However, an investigation of different architectures for image data compression can reveal
feasible data compression methods for which optical computations can replace digital
computations. The successful discovery of such architectures is interesting because they would
represent an extension of the repertoire of optical processing functions into new situations
where optical and digital processes would be directly competitive.

One such success in the search for image data compression architectures with optical
implementation is the IDPCM method [1]. This is a data compression scheme which functions
analogously to conventional digital DPCM compression [2], except that the specific
compression steps are implemented by optical processes.

Interframe data compression methods are applicable only to imagery sources which are
temporal in variation, e.g., the successive frames of a commercial broadcast television
signal. Obviously, if the extent of changes in successive frames of a multiframe sequence is
small, then there will be great temporal redundancy of the spatial information recorded in
the image frames. This is the purpose of an interframe compression system: to remove the
temporal redundancy of the imagery's spatial data.

In  the  following  article  we  examine  a  data  compression  system  which  combines  optical 
spatial  processing  with  temporal  processing  for  interframe  data  compression. 

Interframe  Architecture 

The  basic  structure  which  we  propose  for  the  hybrid  digital/optical  interframe  compres¬ 
sion  system  is  shown  in  Figure  1.  The  upper  portion  of  the  figure  represents  the  portion 
of  the  system  responsible  for  data  compression,  whereas  the  lower  portion  is  the  reconstruc¬ 
tion  system.  The  portions  of  the  system  using  digital  and  optical  componentry  are  clearly 
segregated,  as  well.  We  will  discuss  the  overall  operation  of  the  schematic  and  then  des¬ 
cribe  the  actual  componentry  implementation  of  the  various  blocks  in  the  diagram. 

To the left of the nodes marked 0 in Figure 1 is the portion of the system where
spatial data redundancy is eliminated. The image is subsampled, and the subsamples are used
to reconstruct a low-frequency version of the original image using bilinear interpolation
of the subsamples. The low-frequency version of the image is then subtracted from the
original image. This is, of course, equivalent to a high-pass filtering of the original
image.
To the right of the nodes marked © in Figure 1 is the portion of the system where
temporal data redundancy is eliminated. The quantizer and feedback structure are
identifiable as similar to a conventional DPCM image data compression system [3]. However, there
is one great difference between the structure in Figure 1 and conventional DPCM data
compression: parallel vs. serial data flow. In a conventional DPCM system a set of successive
sequential samples are extracted from the image and saved in the data buffer for prediction
and differencing with succeeding image samples. However, in Figure 1 it is successive
frames of imagery which are buffered, used for prediction, and differenced. Thus, the
architecture of Figure 1 signifies parallel flow of image pixels around the
quantization/feedback loop, as well as parallel differencing at the nodes marked ©. Rather than specific
samples from an image, the DPCM loop in Figure 1 represents parallel image-plane to
image-plane operations. It is equivalent to a bank of N x N serial DPCM processors operating in
parallel (where N x N is the pixel resolution of the image plane).

Likewise,  the  nodes  marked  ©  in  Figure  1  represent  similar  parallel  operations  in  the 
reconstruction  process.  That  is,  a  frame  buffer  saves  each  successive  frame  and  sums  it, 
in  parallel  with  the  succeeding  frame.  Thus,  the  operation  is  identical  to  a  bank  of  N  x  N 
serial  DPCM  reconstructors  operating  in  parallel.  Both  low-frequency  subsamples  and  the 
high-frequency  differences  are  reconstructed  in  this  fashion. 

The  final  reconstruction  step  is  to  generate  a  low-frequency  image  version  from  the  sub¬ 
samples,  again  using  a  bilinear  interpolation.  Finally,  the  high-frequencies  are  reinserted 
into  the  low-frequency  image  through  the  final  summation  step. 
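
The frame-parallel DPCM loop and its reconstructor can be sketched digitally as follows. This is an
illustration under stated assumptions: the predictor is simply the previously reconstructed frame
held in the frame buffer, the quantizer is uniform with a hypothetical step size, and the buffer
starts at zero; none of these choices is specified in the text.

```python
def quantize(diff, step):
    """Uniform quantizer: round the difference to the nearest multiple of `step`."""
    return step * round(diff / step)


def encode(frames, step):
    """Frame-parallel DPCM: every pixel is differenced against the frame buffer,
    the difference is quantized, and the buffer is updated with the
    reconstructed value so that coder and decoder stay in step."""
    buffer = [[0] * len(frames[0][0]) for _ in frames[0]]
    coded = []
    for frame in frames:
        q = [[quantize(p - b, step) for p, b in zip(prow, brow)]
             for prow, brow in zip(frame, buffer)]
        buffer = [[b + d for b, d in zip(brow, qrow)]
                  for brow, qrow in zip(buffer, q)]
        coded.append(q)
    return coded


def decode(coded):
    """Reconstructor: sum each quantized difference frame into the frame buffer."""
    buffer = [[0] * len(coded[0][0]) for _ in coded[0]]
    frames = []
    for q in coded:
        buffer = [[b + d for b, d in zip(brow, qrow)]
                  for brow, qrow in zip(buffer, q)]
        frames.append(buffer)
    return frames


# Two small 2 x 2 frames that differ only slightly (made-up values).
frames = [[[10, 20], [30, 40]],
          [[12, 20], [30, 47]]]
decoded = decode(encode(frames, step=4))
```

Because the encoder predicts from its own reconstruction, the error in every decoded pixel stays
within half a quantizer step and does not accumulate from frame to frame.
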
Componentry Considerations

The implementation of Figure 1 can be segregated into optical and digital components,
with clearly defined interfaces between each. For example, in Figure 1 the nodes marked
(T) represent where optical information enters a region of the system that is dominated by
digital processing. Likewise, in the reconstructor, everything to the right of the point
marked Q) is optical processing, with digital processing to the left of ©.

We  now  summarize  the  actual  component  prospects  at  each  of  the  individual  blocks  in  the 
diagram. 

(1) Subsample. A number of mechanisms can optically subsample an image plane.
For example, if a linear sensor array is used, with the array optically
"push-broomed" across the image plane, then proper timing of the array readout
can subsample the image plane, e.g., extract every nth pixel from every
nth line.

(2) Bilinear interpolation. An optical bilinear interpolation can be constructed
by writing the proper apodization function across a lens, and then throwing
the lens out of focus [4]. For Figure 1, the subsamples extracted would be
imaged on a matrix display, such as a CRT, and the optical interpolation
executed from mis-focus on the display.

(3) Image difference. The difference between the original image and the
low-frequency version is a pixel-by-pixel difference between the two image
planes. Electro-optical mechanisms for this differencing operation have been
demonstrated recently, including electro-optical effects in a liquid crystal
[5], and the use of channel-plate image intensifiers [6].

(4) DPCM processors. Although the conceptual processes in these components are
parallel image-plane to image-plane operations, it is straightforward to see how
they could be implemented serially. For example, subsamples extracted from an
array would go directly into the DPCM loop, the loop being a "pass-through"
around the quantizer to the frame buffer until the frame buffer is loaded
with its reference image. Then the frame buffer data would be extracted,
passed into a working quantization loop, etc. The data could be processed
serially through a single DPCM processor, the data being serially extracted
from the frame buffer and synchronized pixel-for-pixel with the image plane
subsamples. Thus, although processed serially through a single DPCM processor,
the pixel-by-pixel synchronization between subsamples and frame buffer
would have the same effect as N x N parallel DPCM processors operating at a
very low data rate. Note that sensing for subsamples with a suitable detector
(such as a linear array) could provide data in exactly a suitable format for
input to the DPCM processes. Obviously, all operations in the DPCM processors
would be digital. Similar comments apply directly to the DPCM loop
which temporally processes the high frequencies.

(5) Reconstruction. At the nodes marked © the incoming data circulates through
frame buffers and is summed, with incoming image plane pixels synchronized to
the corresponding frame buffer pixels, to regenerate both low and high
frequencies.

(6) Reconstruction bilinear interpolator. The processing here is the same as in
the compression step. The reconstructed samples would be written on a matrix
display which would be imaged out of focus.

(7) Summation. Again, a suitable electro-optical effect would be used to achieve
summation, as in the corresponding difference operation in the compression
portion of the system. This would require conversion of the samples from the
high-frequency DPCM reconstructor into light intensities for the
electro-optical summation.

As can be discerned from this discussion, the overall architecture mixes optical and digital
processes in a hybrid system for interframe compression.
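
The claim in item (4), that a single serial DPCM processor synchronized with a frame buffer behaves like N x N parallel processors, can be checked with a short Python sketch. The quantizer step, the class `PixelDPCM`, and the functions `parallel_bank` and `single_serial` are hypothetical names introduced for this illustration only.

```python
def quantize(diff, step=16):
    """Uniform quantizer (assumed): nearest multiple of `step`."""
    return step * round(diff / step)


class PixelDPCM:
    """One serial DPCM loop holding a single predicted value."""

    def __init__(self):
        self.prediction = None

    def encode(self, sample):
        if self.prediction is None:       # pass-through loads the reference
            self.prediction = sample
            return sample
        q = quantize(sample - self.prediction)
        self.prediction += q
        return q


def parallel_bank(frames, n):
    """N x N independent processors, one per pixel position."""
    bank = [[PixelDPCM() for _ in range(n)] for _ in range(n)]
    return [[[bank[r][c].encode(f[r][c]) for c in range(n)]
             for r in range(n)] for f in frames]


def single_serial(frames, n):
    """One processor; predictions live in a frame buffer read in raster order."""
    frame_buffer = [None] * (n * n)
    out = []
    for f in frames:
        stream = [f[r][c] for r in range(n) for c in range(n)]   # raster scan
        coded = []
        for i, sample in enumerate(stream):   # buffer index i stays in sync
            if frame_buffer[i] is None:       # pass-through on the first frame
                frame_buffer[i] = sample
                coded.append(sample)
            else:
                q = quantize(sample - frame_buffer[i])
                frame_buffer[i] += q
                coded.append(q)
        out.append([coded[r * n:(r + 1) * n] for r in range(n)])
    return out
```

Both functions produce identical output for any frame sequence, because each pixel's prediction depends only on the history at the same pixel position; the serial version simply visits those positions one after another.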

Simulation  Results 


To demonstrate the architecture's feasibility, a series of digital simulation experiments
was carried out. That is, the optical interpolations were replaced by digital
interpolations, the optical differences by digital differences, etc. The resulting digital
simulation was carried out in the Digital Image Analysis Laboratory of the University of Arizona,
Department of Systems Engineering.

The source data for the simulation consisted of a sequence of K digitized frames from a
television broadcast of Walter Cronkite. The frames were digitized at 256 x 256 pixel
resolution, with 8 bits of intensity per pixel.

The 14th-frame NMSE/BR performances for the Walter Cronkite images are summarized in Table 1,
and the NMSE performance across the frames for various bit rates is given in Graphs 1.1 to
1.4. The normalized MSE (NMSE), used as the objective image quality measure, and the overall
bit rate (BR), used as the system performance measure, are defined as follows:


(1) NMSE = [expression illegible in the source]

(2) $\mathrm{BR} = \tfrac{1}{4}\log_2(\mathrm{LFQ}) + \tfrac{3}{4}\log_2(\mathrm{HFQ})$

LFQ:  low  frequency  quantization  level 

HFQ:  high  frequency  quantization  level 

According to Table 1, the NMSE performances closely track the LFQ levels. In
other words, for each LFQ level, the NMSE performances are almost the same, within a range of
1% or less. Also, the HFQ levels have little effect on the NMSE performances of the 14th
frame at the same LFQ level. By contrast, for the same HFQ level, the NMSE performances
improve considerably with increasing LFQ levels. From the transitions of NMSE
performances across the frames for various combinations of HFQ and LFQ, as shown in Graphs 1.1
to 1.4, it can be said that it is essential to allow the large LFQ levels (8 or 16) for
acceptable objective image quality, regardless of the HFQ levels. The subjective image
qualities associated with 1.5 bits/pixel (HFQ = 2, LFQ = 8) and 1.75 bits/pixel (HFQ = 2,
LFQ = 16) are quite good, as shown in Figures 2 and 3. Figures 4 and 5 show the 14th frame
reconstructions for the cases of BR = 1.25 bits/pixel (HFQ = 2, LFQ = 4) and BR = 1.0 bits/
pixel (HFQ = 2, LFQ = 2), in which some artifacts due to the motion displacements are
observed around the subject's shoulder and head, and at the center of the face and chin.

In addition, the transition of NMSE performance for BR = 1.0 bit/pixel in Graph 1.1
shows very good NMSE performance, about 1 - 2%, up to the 8th frame. Due to the large motion
involved between the 8th and 9th frames, the subsequent NMSE performance deteriorates
rapidly. Thus, if we allow the larger LFQ levels such as 8 or 16 at the 9th frame, the
errors due to the motion displacement can be made smaller than in the case of LFQ = 2. In other
words, a temporally adaptive quantization scheme can further improve the NMSE performance
at a relatively small cost in BR performance.
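
One way such a temporally adaptive scheme might be organized is sketched below in Python. The motion measure (mean absolute difference between consecutive frames), the threshold, and the names `motion_measure` and `lfq_schedule` are all hypothetical; no particular measure is defined here, and developing one is listed as future research.

```python
def motion_measure(prev, curr):
    """Mean absolute pixel difference between two frames (assumed measure)."""
    total = sum(abs(a - b)
                for prow, crow in zip(prev, curr)
                for a, b in zip(prow, crow))
    return total / (len(curr) * len(curr[0]))


def lfq_schedule(frames, threshold=10.0, low_lfq=2, high_lfq=8):
    """Pick an LFQ level per frame, raising it only where motion is large."""
    levels = [low_lfq]                    # first frame has no predecessor
    for prev, curr in zip(frames, frames[1:]):
        moving = motion_measure(prev, curr) > threshold
        levels.append(high_lfq if moving else low_lfq)
    return levels
```

For a sequence that is still except for one large displacement, only the frame at the displacement is assigned the larger LFQ level, so the average bit rate rises only slightly.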

Conclusions  and  Further  Research 

The  conclusions  and  further  research  areas  can  be  summarized  as  follows: 

Conclusions

(1) Reasonable subjective and objective image qualities (NMSE = 0.9% - 1.3% for
the 14th frame of the Walter Cronkite images) are obtained at BR = 1.75 bits/
pixel or 1.5 bits/pixel through the digital simulations of the proposed hybrid
interframe data compression scheme.

(2) The multi-frame subjective and objective image qualities are mainly determined
by the magnitude of the motion displacement between frames. In the proposed
hybrid interframe system, it is essential to allow the large LFQ levels (8 or
16) for acceptable image quality, regardless of the HFQ levels, because the
large motion is contained in the low frequency component. It seems that the
contribution of the high frequency component to the reconstructed frame is
compromised by the motion displacement between frames.

Future  research 

(1) Development of a motion displacement measure and motion detector.

(2) Temporally adaptive scheme based on the motion displacement measure.

(3) Detailed system componentry consideration: electro-optical devices.

Simulation Results for Hybrid (Optical/Digital)
Interframe Data Compression Scheme